A. Application of Statistical Pattern Recognition
to Image Interpretation* 


1. INTRODUCTION 
1.1 Background

Analysis of remotely sensed agricultural crop survey data by
pattern recognition algorithms requires the availability of training
samples (data of known classification). In large-scale Landsat crop
surveys, training samples cannot be acquired solely by ground
observations, due either to cost considerations or to inaccessibility of
the survey site. For both of these reasons, the labeling of training samples
based on interpretation of the Landsat data and associated ancillary
data has been utilized in LACIE, in which the manual image interpretation
process has been supported by meteorological data and historical agronomic
data. Although the performance of the analyst-interpreters (AIs) in
LACIE has apparently been adequate to support the project goals, it is
widely recognized that the labeling process, implemented in this manner,
involves a great deal of subjective judgement, and hence the accuracy
and precision of the results can vary greatly from one AI to the next.

The overall objective of this task has been to investigate ways to
upgrade the objectivity and reliability of the image labeling process.
The basic approach proposed involved introduction of quantitative methods,
often related to pattern recognition, in place of subjective judgement 
wherever possible. At the outset, it was hoped that it might be possible 
to develop a completely machine-implemented labeling method. 


* This report covers work under Task 2.2a Application of Statistical 
Pattern Recognition to Image Interpretation. The report was compiled 
by Philip H. Swain. 
1.2 Overview of Previous Work


Training sample labeling by manual interpretation of the Landsat 
imagery was a fundamental assumption of the LACIE approach. Initially 
individuals were required to select fields for classifier training by 
visually locating and outlining agricultural fields which appeared to 
be representative of all spectrally distinguishable ground covers. The 
selection process included identifying the ground cover as wheat or 
non-wheat. As loosely defined as this, the process was not effective, 
because the interpreter could not discern all significant variations in
the image products provided and, furthermore, tended to be biased 
toward the selection of homogeneous and clearly delineated fields. As 
a result, significant spectral categories often were overlooked and 
classifier performance suffered accordingly. 

The appropriate goal in classifier training is to sample the 
measurement space (spectral or spectro-temporal space) adequately to 
obtain a representative sample of the data to be classified. In order 
to be representative, the sample must include, at minimum, observations 
from every class of interest and every class which might be confused 
with a class of interest. Given no information about the distribution 
of data in the measurement space, the optimal strategy for obtaining 
representative training data would be to select a random sample. After 
selection, of course, there remains the task of labeling the selected 
observations to identify their ground cover classes. 

To reduce interpreter bias and improve the probability of getting 
a representative sample for classifier training, the AIs were later 
required to label pixels which had been randomly selected from the 
segment. An assumption implicit in this approach is that a selection 
based on random location in the image will induce a random selection 
from the measurement space, an assumption that appears to be sound. 
Since the sample size was quite small (less than 100 pixels per segment 
out of more than 20,000 to be classified), the probability of missing 
spectral classes was still significant. Nonetheless, this sample 
selection method, incorporated into a generally more robust analysis
procedure, resulted in improved classification results in later phases
of LACIE.
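
The risk of missing a spectral class with so small a sample can be made concrete. The following is a minimal sketch, assuming simple random sampling of pixels without replacement; the segment size, sample size and class fractions are illustrative values chosen to match the orders of magnitude quoted above, not figures taken from LACIE, and the function name is hypothetical.

```python
from math import comb


def prob_class_missed(n_total, n_class, n_sample):
    """Probability that a simple random sample (without replacement) of
    n_sample pixels contains no pixel from a class that occupies n_class
    of the n_total pixels in the segment (hypergeometric zero term)."""
    if not 0 <= n_class <= n_total or not 0 <= n_sample <= n_total:
        raise ValueError("need 0 <= n_class <= n_total and 0 <= n_sample <= n_total")
    return comb(n_total - n_class, n_sample) / comb(n_total, n_sample)


if __name__ == "__main__":
    n_total = 20000   # illustrative segment size ("more than 20,000" pixels)
    n_sample = 100    # illustrative upper bound on the sample size
    for fraction in (0.01, 0.02, 0.05):
        n_class = round(fraction * n_total)
        p = prob_class_missed(n_total, n_class, n_sample)
        print(f"class fraction {fraction:.2f}: P(class missed) = {p:.3f}")
```

Under these assumptions a class covering about one percent of a segment is missed roughly one time in three, which is consistent with the remark that the probability of missing spectral classes was still significant.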

With the analyst-subjective factor removed from sample selection, 
there remained considerable subjectivity in the labeling (ground cover 
identification) process, the problem specifically addressed by this 
investigation. To proceduralize the labeling process, a questionnaire 
containing segment-related and pixel-related questions was formulated 
at JSC to lead the AI systematically through the available image data 
and supporting data [1]. To the extent possible, based on exploratory
work to date, the supporting data included quantitative aids, such as
spectrally "normalized" imagery [2] and temporal greenness/brightness
trajectories for each pixel to be labeled [3]. This labeling method was
called Label Identification from Statistical Tabulations (LIST).

Despite the availability of spectral aids, however, it remained for 
the AI to make a subjective integration of the evidence to produce the 
necessary set of cover type labels. This process remained tedious 
and subject to a great deal of analyst-to-analyst variability. An 
effort to improve this situation was mounted, in which some of the key 
questions were made more quantitative and an attempt was made to automate 
them [4]. The results were promising, although the reported experiments 
were carried out over too limited an area (two LACIE segments in North 
Dakota) to permit general conclusions. Nonetheless, this represented 
another positive step in the direction of making the derivation of training
data more objective. 

1.3 Objectives and General Approach 

As noted previously, the overall objective of this investigation has 
been to improve the objectivity and reliability of the image labeling 
process in order to provide classifier training samples in the absence 
of ground truth. Our approach may be described in terms of three subob- 
jectives, outlined below. 
Analysis of the Current LIST Process. This required selection and
acquisition of appropriate data, implementation of the process, and 
assessment of the labeling results produced by applying the process to 
the data. 

Investigation of Possible Methods for Machine Implementation of
the Labeling Process. The starting point was preliminary work reported
in [4], to be implemented and applied to in-house data for comparison
with results achieved by AIs. 

Extension to More General Applications. It was originally planned
to develop and test an extension of the LIST process to crop inventory
involving corn, soybeans and other major crops. It was subsequently
decided by LARS and JSC to concentrate all resources on the wheat 
inventory applications. A few comments on multicrop extensions, based 
on our experience with wheat, will be included near the conclusion of 
this report. 

1.4 Overview of Accomplishments 

The LIST process was implemented and applied to a total of 13 
LACIE segments in Kansas, North Dakota and South Dakota. This permitted 
LARS personnel to gain insightful familiarity with the process and 
provided a data base for accomplishing the project objectives. 

A number of weaknesses in the current LIST process were pinpointed. 
The length and tedium of the process adversely affect the attainable 
results. The analysts were able to suggest specific ways to make the 
use of LIST more efficient and even formulated an alternative
questionnaire as a step in this direction. But they also recommended that
analysts should have knowledge of and exposure to the wheat growing 
process if they are to be able to perform the AI role in an optimal 
fashion. 



That role is still a very subjective one, however, the AI being
expected to integrate diverse forms of information into the labeling
process in ways that are not very quantitative. There are real
possibilities for improvement, because we have shown that the process depends
most critically on a few key features in the data which apparently can
be quantified. Experimental results involving data from seven Kansas
segments showed that a simple but completely computerized labeling 
method based on these features could perform at least as well as the 
AIs using the full LIST process. 

These results may be used to advantage either by (1) replacing the
AI and LIST by a faster and possibly less expensive machine-implemented
labeling process of equal capability, or (2) providing the AI with the
quantitative results to be used as an aid in obtaining still better 
results through integration of other forms of information. 

Further research is required to establish the viability of the 
latter strategy. However, it seems clear that in the near term, in which 
multicrop extensions of the present technology are sought, the increased 
difficulty of the labeling task will require continued use of the AI as 
an active agent to bring together diverse sources of information which 
can contribute to accurate labeling. 

Finally, it is important to recognize a fundamental limitation of 
the investigation reported here. The LIST process calls for data from 
four strategically timed acquisitions of the primary multispectral data, 
and the data base used in this study was selected to meet this requirement. 
Although the impact of poorly timed or missing acquisitions was not 
specifically considered, that impact clearly can be substantial. Further
research will be required to minimize the sensitivity of the LIST process 
to less-than-ideal data acquisition. 
2. DESCRIPTION OF THE RESEARCH

This investigation consisted of two distinct, though related, 
components. The first component, analysis of the LIST process, required 
implementation and use of the labeling process in order to gain insightful 
familiarity with it and to accumulate data with respect to both the results 
it could produce and how it produced them. We intended to assess both the 
strengths and weaknesses of the implemented LIST method, and determine, 
if possible, how improvements in the objectivity and reliability of the 
labeling process might be achieved. 

2.1 Analysis of the LIST Process

Data Set Assembly.

In support of this task, a comprehensive data set was assembled based 
on multitemporal Landsat data for 13 LACIE segments (Table A-1). This
data base consisted of a wide variety of types of information that were
necessary for the AI to use in answering the LIST questions and ultimately 
labeling training samples. The primary data were in the form of five- 
inch positive transparencies of the Landsat data for each of four or more 
acquisition dates in each of the 13 segments. These transparencies were 
Production Film Convertor (PFC) products supplied by JSC in roll form. 

The available segments were visually screened by an AI and the segments 
and acquisition dates were selected based on the following criteria: 

1. 1976 segments should be from the prime wheat producing states, 
Kansas and North Dakota; 

2. 1977 segments should be from Central Plains states where winter/ 
spring wheat is a major crop; 

3. Landsat data must be available for four acquisitions, each 
during a time period corresponding to a significant biostage 
or growth development stage of the wheat crop; 

4. The data must be largely free of clouds and haze. 

The locations of the 13 LACIE segments selected based on these 
criteria are shown in Figure A-1. 
Table A-1. LACIE Segments Analyzed at LARS by the LIST Method

State        LACIE Segment Number    County      Growing Year

Kansas       1163                    Coffey      1976
             1165                    Linn        1976
             1852                    Lane        1976
             1855                    Trego       1976
             1857                    Grant       1976
             1860                    Hodgeman    1976
             1865                    Stevens     1976

N. Dakota    1633                    Foster      1976
             1637                    Stutsman    1976
             1661                    McIntosh    1976
             1652                    Stark       1977
             1897                    McHenry     1977

S. Dakota    1681                    Roberts     1977


Once the film products for the 13 segments were assembled, they were 
photographically enlarged at LARS to an 8" x 10" paper print. The 
enlargements made it much easier for the analyst to locate and evaluate 
the individual pixels of interest.


An additional film product, supplied by JSC as part of the supporting 
data, was a supplemental color product (Kraus product) [2]. This
photographic product, also enlarged and printed at LARS, was similar to the
false color product discussed above, but its colors were "normalized" to 
produce an image in which a similar hue of redness was expected to always 
indicate a similar amount of green biomass and possible degree of crop 
development. 

A second type of data assembled in support of this task included tables
and summaries regarding weather and crop conditions and historical wheat 
yield and development patterns. Specific items in this category were: 




- U.S. and Canada Meteorological Summaries of precipitation, freeze 
dates, crop development, and disease and insect infestation; 

- Universal Strata Descriptors of climate, soil conditions, agricul- 
tural practices, and other crop-related variables; 

- LACIE County-Level Historical Agricultural Statistics listing
percent of agricultural lands within counties having LACIE segments
and estimates of each crop type harvested in that county over the
past two years;

- Wheat Yield Information for each county in each of the wheat 
growing states of the U.S.; 

- Crop Calendar for each crop reporting district (CRD) in the wheat- 
producing states showing the onset and completion of each biostage 
for each of the crops grown within that district. 

All of these data items were thought to be of assistance in enabling 
the AIs to label the segment training samples using the LIST procedure. 

It should be noted, however, that these materials came from diverse
sources and were therefore not compiled or designed to be the most
accessible or convenient in format for use by the AIs. Valuable data had
to be separated from extraneous sections of other information within
each set of data. This was not only time-consuming but also a
non-productive activity for the AIs.

In addition to the photographic materials, tables and summaries,
some quantitative analysis aids were generated at LARS to support
the LIST analysis. These included trajectory plots of greenness values
versus acquisition date for each pixel to be labeled, and generalized
"typical" trajectory plots for wheat in Kansas and North Dakota. Examples
of these plots are given in Figure A-2. The typical plots were utilized
by the AIs in forming a mental trajectory image against which to compare
the plot of each training sample to be labeled. It was
found that variations in acquisition dates and agricultural conditions
from segment to segment made straightforward correlation between the
"typical" and sample plots unrealistic.
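
The comparison the AIs performed mentally can be sketched numerically. The example below is only an illustration, not part of the LIST procedure: it assumes a "typical" trajectory given as (day of year, greenness) pairs, reads the acquisition day from codes of the form 76136 (two-digit year followed by day of year), interpolates the typical curve at the sample's own acquisition days, and reports an RMS difference. All greenness values and function names are hypothetical.

```python
def day_of_year(acquisition_code):
    """Day of year from a code such as '76136' (two-digit year, then day)."""
    return int(str(acquisition_code)[2:])


def interpolate(typical, day):
    """Linearly interpolate a typical trajectory, given as (day, greenness)
    pairs with strictly increasing days, at the requested day.
    Returns None when the day lies outside the range of the trajectory."""
    for (d0, g0), (d1, g1) in zip(typical, typical[1:]):
        if d0 <= day <= d1:
            return g0 + (g1 - g0) * (day - d0) / (d1 - d0)
    return None


def rms_difference(sample, typical):
    """RMS greenness difference between a pixel trajectory, given as
    (acquisition_code, greenness) pairs, and the typical trajectory
    evaluated at the pixel's own acquisition days."""
    diffs = []
    for code, greenness in sample:
        reference = interpolate(typical, day_of_year(code))
        if reference is not None:
            diffs.append(greenness - reference)
    if not diffs:
        raise ValueError("no acquisition falls inside the typical trajectory")
    return (sum(d * d for d in diffs) / len(diffs)) ** 0.5


if __name__ == "__main__":
    # Hypothetical typical wheat trajectory: (day of year, greenness).
    typical = [(60, 4.0), (100, 18.0), (140, 30.0), (170, 12.0), (200, 5.0)]
    # Acquisition codes of the kind used for the segments; greenness invented.
    sample = [("76073", 8.0), ("76136", 27.0), ("76154", 20.0), ("76190", 6.0)]
    print(f"RMS greenness difference: {rms_difference(sample, typical):.2f}")
```

Interpolating at the sample's acquisition days removes only the effect of differing dates; segment-to-segment differences in agricultural conditions would still shift the curves, so a small RMS difference is suggestive at best.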

[Figure A-2: four trajectory plots, panels (a) through (d), with axes
greenness and brightness. Plot legend: Segment = 1857; A: Acquisition
of 76073; B: Acquisition of 76136; C: Acquisition of 76154;
D: Acquisition of 76190.]

Figure A-2. a) Trajectory Plot for a Wheat Sample in Kansas, b) Trajectory
Plot for a Non-wheat Sample (Corn) in Kansas, c) "Typical"
Trajectory Plot for Wheat in Kansas, d) "Typical" Trajectory
Plot for Wheat in North Dakota.

Another quantitative aid examined for implementation for this task
was an automatic screening of the segment Landsat data in order to locate
features in the data of a distinctly non-agricultural nature. This
technique, developed at ERIM, involved delineating the "designated
other" (DO) areas (water, woods, urban areas) and "designated
unidentifiable" (DU) areas (clouds, cloud shadows, haze, snow, flooded
areas) by applying thresholds to the data and printing the resultant
maps [5]. The results were judged by the analysts to be unsatisfactory
due to the appearance of small scattered areas of "bad data" or "shadow"
which did not agree with visual examination of the film products. The ERIM
documentation included a warning that the algorithm is very sensitive
to the threshold settings, so the areas where errors occurred were
examined and the thresholds adjusted to eliminate these errors. With
the new thresholds a different problem occurred: areas of actual shadow
or water were not completely delineated. Examination of the data values
in problem areas to determine optimal thresholds revealed that the
thresholds which minimized the error on segment 1633 would not minimize
the error on segment 1637 and that the overall error occurrence for
segment 1637 could never be as low as on segment 1633. It was judged
that the automatic screening procedures would not produce a gain in
accuracy or saving of analyst time in delineating the DO and DU areas.
As a result the AIs screened the false color images visually to locate
the readily recognizable DO and DU areas.
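
The difficulty with segment-specific thresholds can be illustrated with a toy example. This is not the screening algorithm of [5], whose details are not given here; it is a sketch that assumes a single data value per pixel, a single "flag if at or below threshold" rule, and invented pixel values and labels for two hypothetical segments.

```python
def error_count(values, is_target, threshold):
    """Number of pixels misclassified when every pixel whose value is at or
    below the threshold is flagged (for example as shadow or water)."""
    return sum((v <= threshold) != t for v, t in zip(values, is_target))


def best_threshold(values, is_target):
    """Candidate threshold, taken from the observed values, that gives the
    fewest errors (the lowest such value when several tie)."""
    candidates = sorted(set(values))
    return min(candidates, key=lambda th: error_count(values, is_target, th))


if __name__ == "__main__":
    # Invented data: pixel values and whether each pixel truly is shadow/water.
    segment_a = ([10, 12, 30, 40], [True, True, False, False])
    segment_b = ([10, 25, 20, 40], [True, True, False, False])
    for name, (values, truth) in (("A", segment_a), ("B", segment_b)):
        th = best_threshold(values, truth)
        print(f"segment {name}: best threshold = {th}, "
              f"errors = {error_count(values, truth, th)}")
```

In this toy case the two segments call for different thresholds, and the second cannot be screened without error at any threshold because its target and non-target values overlap, which mirrors the behavior reported for segments 1633 and 1637.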

The final type of data to be included in the data base was a 
complete set of ground truth information for the 13 segments to be 
labeled. The ground truth was initially available as photo-interpreted
blueprint copies of high-altitude aerial photography. However, this
data was later replaced by Universal format data tapes in which every 
Landsat pixel had been labeled as one of 101 possible crop types or 
conditions. By accessing these tapes using the EOD-LARSYS $HIST and 
$GRAYMAP processors, detailed maps were generated which could be used 
to evaluate analyst labeling performance. The data base was subsequently 
augmented to include files containing both the ground truth and the 
analyst labels for the specific pixels labeled by the AIs. These files 
were utilized to evaluate the LIST method of labeling. This portion 
of the task will be discussed in greater detail in a later section of 
this report. 
In summary, a valuable data base was assembled for 13 LACIE segments
in support of evaluating the LIST procedure. Compilation of the data 
base involved acquiring, handling, storing, and accessing a great variety 
of data types. It is felt that this data base is sufficiently extensive 
and well-documented to be incorporated in further studies requiring a
data base of this type. Application will be made to have the data 
placed in the public domain. 

Implementation of the LIST Process.

Three analysts were assigned the task of applying the LIST process 
to the thirteen LACIE segments described above. Two objectives were 
involved: to make the analysts as familiar as possible with the process
in order to provide a means of evaluating the process subjectively; and
to develop a data base to be used for evaluating the process objectively. 

Two of the analysts were graduate students in electrical engineering
having some experience in digital analysis of multispectral imagery.
The third analyst was a geologist with extensive remote sensing experience.
Each analyzed the data as independently as possible given their
day-to-day proximity in the LARS environment. All three completed labeling
of the seven Kansas segments; two completed all thirteen segments.

Effective implementation of the LIST process at LARS required 
familiarity with the Universal Format for data tapes and conversion to 
LARSYS format (the conversion facilitated computation of statistics 
using existing software). Computer programs were also developed for 
generation of greenness trajectory plots for the pixels to be labeled.
In general, these pixels were the seventy random "dots" specified by the
LACIE Phase III type 1 and type 2 overlays [6].

Having labeled the seven Kansas segments, the analysts felt 
sufficiently experienced to evaluate the LIST questionnaire. Some 
questions were found too general to be of real value. Other questions 
were ambiguous and could not be interpreted. Some were judged not
helpful to the decision process. In some instances, it was felt that the
addition of one or two questions would contribute to a better judgement.
Thus, a revised LIST questionnaire was formulated by the analysts. Both
the original and revised questionnaires appear in the Appendices. The
revised questionnaire was used for processing the segments not located in
Kansas.

The following changes were recommended: 

1. Questions 19-23 and Question 25 were thought to be too general 
to apply to specific pixels and were not considered helpful. They were 
merged, therefore, into Question 19 in the new list, in which the analyst 
is asked to give an overall evaluation of the crop condition based on the 
meteorological data available. 

2. Question 32 was modified, becoming Question 27 in the new list. 
It was the analysts' experience that a pixel might still be fallow even 
though there were some indications of vegetation in it. Hence, the new 
question asks the analyst to decide on the pixel based on an overall 
judgement rather than on the vegetation indication alone. 

3. A new question was inserted between Questions 33 and 34 in the 
old questionnaire (29 in the new list). It requires the analyst to 
determine whether fallowing is practiced in the segment. This helps the 
analyst decide whether the pixel is a fallow or a non-agricultural pixel. 

4. Question 39 in the old questionnaire was reworded as Question 34
in the new list. The phrase "all available data" should be emphasized
as the analyst might otherwise base judgement on only the false color
imagery.

5. Two questions were inserted between Questions 39 and 40 in the
old questionnaire (new Questions 35 and 36). These require the analyst
to consider whether the pixel is representative of the field it is in.
If it is not, the next question checks to see if the field containing
that pixel follows a small grain development pattern. The idea is to
label the pixel according to its field in case it is not representative
of the field.

6. Questions 40 and 41 in the old questionnaire were merged into 
Question 37 in the new. The two old questions basically have the same
meaning, and a single question was felt to be adequate instead.
7. Question 43 in the old questionnaire was deleted. It does
not contribute to the decision and serves only to add confusion, as no 
guidance is given as to how closely or in what manner the percentages 
must match to motivate a particular choice of answer. 

8. It was felt that Questions 45-51 in the old questionnaire must
be altered in some way to be of any value to the analyst in distinguishing
wheat from other small grains. However, the analysts did not feel 
sufficiently knowledgeable of the wheat-growing process to suggest an 
appropriate alteration. 

A number of additional points related to the LIST processing and 
the analysts’ experience with it are pertinent to the evaluation. These 
are summarized as follows: 

1. The "Kraus Product" PFC imagery is intended to provide
"normalized" color as a basis for acquisition-to-acquisition comparisons of
imagery. However, in many cases, the analysts did not feel confident
that the "redness" of a field in the Kraus product could be taken as a
reliable quantitative indication of the vegetative state or quality of
the field. This was particularly true in the 1976 data; the Kraus
products were thought to be more reliable in the 1977 data. The analysts
felt that the "Product 1" imagery was still the most interpretable and
information-bearing.

2. Interpretation keys were not made available by JSC. Consequently, 
the answers to Question 34 were necessarily very subjective, and, though 
this did not create a serious problem, the need for the keys was definitely 
felt in some cases. 

3. The available meteorological data was found to be incomplete and,
in many instances, not helpful. The data on 3-days-prior precipitation
(Question 18) was not provided for 1976, although it was provided for
1977. In any case, the information on prior precipitation was found too
general to be of significant value. The data was based on the city
nearest to the county containing the segment being labeled. The nearest
city almost always proved to be too distant to provide reliable
information about the segment itself.
4. The analysts felt that accurate crop calendar information was
absolutely essential in assessing the wheat growing stage.

5. The "typical" trajectory plots were found inadequate for precise
numerical comparison with trajectory plots of pixels to be labeled.
The shape of the plots was clearly helpful, but the judgement
of the analysts as to whether a pixel followed a small grains trajectory
plot was very subjective. In many instances, the analysts were confronted
with pixel trajectory plots that were ambiguous and could be interpreted
in different ways.

6. In the original LIST questionnaire, the analysts were directed 
to omit answering a number of questions when a pixel was temporally 
misregistered. However, a labeling decision was still sought. The 
analysts felt strongly that these pixels should be deleted from consider- 
ation altogether due to the unavailability of the important trajectory 
plot information. Certainly, the reliability and utility of data under 
such circumstances is doubtful. 

7. Data could not be automatically screened for DO and DU prior to
analysis because the available screening process was found to be
ineffective. Thus, the analysts all had to agree jointly on DO and DU
areas in a given segment.

8. A very serious problem encountered by the analysts was the 
"unanswerability" of Questions 45-51. The analysts felt strongly that, 
with the given information, they could not discriminate between wheat 
and other small grains. Furthermore, some of these questions (46, 47) 
are unclear. For all cases in which there were small grains other than 
wheat in the segment, the analysts were unable to discriminate the wheat. 

9. In some cases the acquisition dates provided were not "typical" 
in the sense that they did not adequately cover the different stages of 
wheat growth. The analyst sometimes had to settle for whatever dates
were available, some of which may have carried little information.
Further, because of these temporal shifts, the pixel trajectory plot shapes 
were altered, and the analyst had to interpolate mentally to decide 
whether a pixel trajectory plot was similar to that of small grains. 

The analysts felt that a significant handicap was their inexperience
with the wheat growing process and that such experience would definitely
contribute to better understanding of the LIST process and the rationale
behind the questions. This could lead to better labeling accuracies.
Another factor mentioned was the length and tediousness of the process.
The analysts felt that an effort should be directed at automating at
least a part of the process, although they also observed that the analyst's
role is of such importance that they doubt whether the process can be
completely automated.

Labeling Performance. JSC-supplied ground truth tapes were used to
evaluate the labeling results. Using these ground truth tapes and the
EOD-LARSYS program $GRAYMAP, a map of each segment was generated. These
maps identified the fields in the segment down to the subpixel level,
each pixel divided into six subpixels. The ground cover classes were
grouped into the following categories: wheat, small grains, non-small
grains, fallow, and non-ag. The pixels were then assigned a ground truth
label as follows: if all subpixels were of one class (e.g., wheat,
fallow, etc.), the ground truth label was that class; if the pixel was
partially wheat or small grains and partially one of the non-small
grain types, the pixel was labeled as an edge point.

The analyst-labeled pixels were then compared to the corresponding
pixels on the ground truth map. The LIST questionnaire limited analyst
labels to the following categories: wheat, small grains, non-small
grains, fallow, non-ag and edge. Each pixel label was called correct
or not according to the following rules:

1. If the analyst answered wheat or small grains and the ground 
truth label also was wheat or small grains, the answer was considered 
correct. 

2. If the analyst's answer was any non-small grains category and
the ground truth label was also a non-small grains category, the answer
was considered correct.

3. If the ground truth label for the pixel was edge, the pixel
was disregarded, since it was partially small grains and partially
non-small grains.

4. Anything else was considered an error. 



The accuracy of each analyst was then found by dividing the number 
of correctly labeled pixels by the total number of pixels labeled 
(disregarding edge pixels). 
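
The scoring rules above can be written out as a short sketch. The function names are hypothetical, and one point is an assumption rather than something stated in the text: a pixel whose subpixels mix classes within the same group (for example fallow and non-ag) is given a generic label for that group, since only the small grains versus non-small grains distinction affects scoring.

```python
SMALL_GRAIN_LABELS = {"wheat", "small grains"}
OTHER_LABELS = {"non-small grains", "fallow", "non-ag"}


def group(label):
    """Map a cover label to the small grains (SG) or non-small grains (NSG) group."""
    if label in SMALL_GRAIN_LABELS:
        return "SG"
    if label in OTHER_LABELS:
        return "NSG"
    raise ValueError(f"unknown label: {label!r}")


def ground_truth_label(subpixels):
    """Ground truth label for one pixel from the classes of its subpixels."""
    classes = set(subpixels)
    if len(classes) == 1:
        return subpixels[0]              # all subpixels of one class
    groups = {group(c) for c in classes}
    if len(groups) == 2:
        return "edge"                    # part small grains, part non-small grains
    # Assumption (not specified in the text): a mixture within one group
    # gets that group's generic label.
    return "small grains" if groups == {"SG"} else "non-small grains"


def is_correct(analyst_label, truth_label):
    """True or False per rules 1, 2 and 4; None for disregarded pixels (rule 3)."""
    if truth_label == "edge":
        return None
    if analyst_label == "edge":
        return False                     # rule 4: anything else is an error
    return group(analyst_label) == group(truth_label)


def accuracy(pairs):
    """Percent correct over (analyst_label, truth_label) pairs, edge truth excluded."""
    scored = [r for r in (is_correct(a, t) for a, t in pairs) if r is not None]
    if not scored:
        raise ValueError("no scorable pixels")
    return 100.0 * sum(scored) / len(scored)


if __name__ == "__main__":
    pixels = [("wheat", "small grains"), ("fallow", "wheat"),
              ("wheat", "edge"), ("edge", "fallow")]
    print(f"accuracy = {accuracy(pixels):.1f} %")
```

Note that under these rules an analyst label of wheat is scored correct against a ground truth of other small grains, so the resulting figures measure small grains versus non-small grains accuracy rather than wheat identification as such.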

The labeling results for seven Kansas, one South Dakota, and five 
North Dakota segments are plotted in Figures A-3 to A-5. The accuracy 
figures for each analyst and each segment are shown in Table A-2. The 
general trend of these results suggests that which segment is being 
analyzed has more influence on the results than which analyst is 
producing the results. There is further evidence to support this 
conclusion. Table A-3 characterizes the major sources of error for 
the segments processed by the analysts. For the most part, although 
the types of errors vary from segment to segment, all analysts tended
to make the same type of error on a given segment. 

A more complete study of the analyst and segment effects was provided
by an analysis of variance of the results, carried out with the SPSS
ANOVA procedure to test the significance of the effects of the analyst,
the segment, and the analyst x segment interaction. The ANOVA results
are shown in Table A-4. A qualitative look at the results shows that
analyst effects and analyst x segment interaction effects are not
significant; however, segment effects are significant.
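
For readers checking Table A-4, each mean square is the sum of squares divided by its degrees of freedom, and each F ratio is the effect mean square divided by the error mean square. The sketch below recomputes the Kansas F ratios from the sums of squares and degrees of freedom listed in that table; the function name is hypothetical, and no significance levels are computed.

```python
def anova_table(effects, error):
    """Mean squares and F ratios for an ANOVA summary.

    effects: dict mapping effect name -> (sum_of_squares, degrees_of_freedom)
    error:   (sum_of_squares, degrees_of_freedom) for the error term
    Returns a dict mapping effect name -> (mean_square, f_ratio).
    """
    ss_error, df_error = error
    if df_error <= 0 or ss_error <= 0:
        raise ValueError("error term needs positive SS and df")
    ms_error = ss_error / df_error
    table = {}
    for name, (ss, df) in effects.items():
        ms = ss / df
        table[name] = (ms, ms / ms_error)
    return table


if __name__ == "__main__":
    # Sums of squares and degrees of freedom as listed for Kansas in Table A-4.
    kansas_effects = {
        "Analyst": (153.859, 2),
        "Segment": (1623.805, 6),
        "Anal x Seg": (98.632, 1),
    }
    kansas_error = (618.014, 11)
    for name, (ms, f) in anova_table(kansas_effects, kansas_error).items():
        print(f"{name:<12} MS = {ms:9.3f}   F = {f:7.3f}")
```

The recomputed ratios agree with the tabulated Kansas values to the three decimals shown, with the segment F ratio several times larger than the analyst and interaction ratios.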

The major points to note are: 

1. The area being analyzed has an effect on the labeling accuracy 
of the analyst. Note the much lower accuracy for the Dakota states as 
compared to Kansas. Also note the segment-to-segment variation in
accuracy. This effect may be due to: 

- Cropping practices that vary from area to area (e.g., strip
cropping, contour farming, irrigation).

- Confusion crops grown in a particular area (e.g., hay and
pastureland are sometimes difficult to differentiate from wheat).

- Unusual growth patterns for a given area (e.g., late planting,
effects of drought or disease, early harvest).
[Figure A-5. Small Grains/Non-Small Grains Accuracy vs. Segment;
plotted segments: 1652, 1681, 1897.]



Table A-2. Labeling Accuracy

Kansas (1976 data)

  By Analyst                By Segment
  Analyst    Accuracy       Segment    Accuracy
  1          80.7 %         1857       80.8 %
  2          84.2 %         1860       64.6 %
  3          77.5 %         1865       79.0 %
  Overall    80.8 %         1163       83.1 %
                            1165       92.7 %
                            1852       90.5 %
                            1855       75.0 %
                            Overall    80.8 %

North Dakota (1976 data)

  By Analyst                By Segment
  Analyst    Accuracy       Segment    Accuracy
  1          72.0 %         1633       66.2 %
  2          69.1 %         1637       86.0 %
  Overall    70.5 %         1661       59.5 %
                            Overall    70.5 %

North and South Dakota (1977 data)

  By Analyst                By Segment
  Analyst    Accuracy       Segment    Accuracy
  1          67.9 %         1652       67.1 %
  2          72.0 %         1681       68.1 %
  Overall    70.0 %         1897       74.8 %
                            Overall    70.0 %

Summary: Analyst 1 average = 75.7 %
         Analyst 2 average = 77.8 %
         Analyst 3 average = 77.3 %
         Overall average   = 77.0 %

Table A-3. Major Contributions to Labeling Error

Segment 1857
    Analyst 1: Wheat low; Fallow high
    Analyst 2: Wheat low; Fallow high
    Analyst 3: Wheat low; Fallow high

Segment 1860
    Analyst 1: Wheat high; Fallow low
    Analyst 2: Wheat/Non-small grains confusion
    Analyst 3: Wheat high; Fallow low

Segment 1865
    Analyst 1: Small grains high; Fallow low
    Analyst 2: Small grains high; Fallow low
    Analyst 3: Small grains high; Fallow low

Segment 1163
    Analyst 1: Wheat/Non-small grains confusion; Fallow high
    Analyst 2: Wheat low; Fallow high
    Analyst 3: Wheat high; Edge high; Fallow low

Segment 1165
    Analyst 1: Fallow high
    Analyst 2: Fallow high
    Analyst 3: Small grains high; Fallow high; Edge high

Segment 1852
    Analyst 1: Wheat high; Fallow high
    Analyst 2: Fallow high
    Analyst 3: Wheat high; Fallow high

Segment 1855
    Analyst 1: Wheat high
    Analyst 2: Wheat/Non-small grains confusion; Fallow high
    Analyst 3: Wheat high; Fallow high

Segment 1633
    Analyst 1: Small grains high
    Analyst 2: Small grains high

Segment 1637
    Analyst 1: Small grains high
    Analyst 2: Small grains high

Segment 1661
    Analyst 1: Small grains/Non-small grains confusion
    Analyst 2: Small grains/Non-small grains confusion

Segment 1652
    Analyst 1: Small grains/Non-small grains confusion
    Analyst 2: Small grains/Non-small grains confusion

Segment [segment number missing]
    Analyst 1: Small grains low
    Analyst 2: Small grains low

Segment [segment number missing]
    Analyst 1: Small grains/Non-small grains confusion; Fallow low
    Analyst 2: Small grains/Non-small grains confusion; Fallow low

Table A-4. ANOVA Tables for Kansas and Dakotas

Kansas (1976 Data)

    Source        df        SS          MS          F
    Analyst        2      153.859      76.930      1.369
    Segment        6     1623.805     270.634      4.817*
    Anal x Seg     1       98.632      98.632      1.756
    Error         11      618.014      56.183        -

North Dakota (1976 Data)

    Source        df        SS          MS          F
    Analyst        1       12.615      12.615      7.994
    Segment        2      756.372     378.186    239.662*
    Anal x Seg     1        2.45        2.45       1.553
    Error          1        1.578       1.578        -

Dakotas (1977 Data)

    Source        df        SS          MS          F
    Analyst        1       25.216      25.216       .504
    Segment        2       70.505      35.253       .705
    Anal x Seg     1         .04         .04        .001
    Error          1       50.029      50.029        -

* significant at the α = .05 level
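
The entries of Table A-4 are linked by the usual relations MS = SS/df and F = MS/MS(error). The following is a minimal sketch that rechecks the Kansas rows from the tabulated sums of squares; the function and variable names are illustrative only and are not taken from the SPSS run described in the text.

```python
def f_ratio(ss, df, ss_error, df_error):
    """Return (mean square, F) for one ANOVA source tested against the error term."""
    ms = ss / df
    ms_error = ss_error / df_error
    return ms, ms / ms_error


# Kansas (1976) rows of Table A-4: source -> (sum of squares, degrees of freedom)
kansas = {
    "Analyst": (153.859, 2),
    "Segment": (1623.805, 6),
    "Anal x Seg": (98.632, 1),
}
kansas_error = (618.014, 11)  # (SS, df) of the error row

for source, (ss, df) in kansas.items():
    ms, f = f_ratio(ss, df, *kansas_error)
    print(f"{source:<11} MS = {ms:8.3f}   F = {f:7.3f}")
```

Under these relations the segment row gives an F value of about 4.82, in agreement with the table.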



2. Analyst effects in this study were not significant. However, 
we cannot conclude that this would be true in general, since all three 
analysts had similar amounts of related experience and knowledge of 
wheat growth. 

3. The revised LIST questionnaire was used by the analysts to
label the 1977 blind sites in North and South Dakota. Note that there
is no significant change in accuracy compared to the analysis of
1976 sites in North Dakota using the original LIST.

Assessment of the LIST Characteristics.

We examined the pattern of analyst responses to the pixel-specific 
questions of LIST in order to determine which questions have important 
discriminatory power and how accurately these questions were answered. 
Our objectives were to understand the actual workings of the current 
process, hoping thereby to be able to modify it to become more 
quantitative and possibly have more (or all) of the work done by a 
computer. 

The evaluation was based on 1976 Landsat and ancillary data from 
seven blind-site segments in Kansas. Three analyst-interpreters (AIs) 
at LARS filled out the LIST questionnaires. Their answers to all the 
questions were then keypunched to create a computer-readable data set, 
which contained the responses of 

- 3 AIs for segments 1163, 1855, 1857, 1860, 1865 

- 2 AIs for segments 1165, 1852. 

Ground truth for each labeled pixel was added to the file of AI data 
to form the basic data set, which contained information on 1359 pixels. 
Of these, 146 had ground-truth (GT) codes for "edge between wheat or 
small grains and something else." Another 11 pixels were "designated 
unidentifiable" (DU) by the analyst, probably due to haze or clouds. 

The analysts labeled an additional 15 pixels "edge"; it is not
determinable from available ground truth whether these are in fact
field edges or errors. This left 1187 pixels with unequivocal AI and
GT codes: Non-agricultural; Fallow; Non-small grains; Wheat; Small
grains.



There is, however, variability in the meaning of the label "small
grains." For the ground truth label, "small grains" means "small grains
other than wheat"; for the AI label, "small grains" means "small grains
which may be wheat but could also be oats, barley, or rye."

Figure A-6 shows the flow of all the pixels through the LIST
questionnaire, and Figures A-7, A-8, and A-9 show the paths taken by pixels
with GT labels of "wheat/small grains," "non-small grains," and "non-ag/
fallow," respectively.

Questions 31 and 32 form the basis of the first major branching
point. They ask the analyst to judge the presence and development
stage of vegetative canopy based on the color of the pixel on each of
two production film converter (PFC) images: Product 1 and Product 3
(or Kraus product). For all 4 acquisitions the pattern of responses
to Questions 31 and 32 was similar. Thus, either the two images gave
almost the same information, or the AI combined his impressions from
both images before answering either question. For the earliest
acquisition (which ranges from March 10 to April 18, depending on segment)
the answers to these questions, grouped by ground truth category, are
shown in Table A-5. For the third acquisition the responses to Question
31 are shown in Table A-6.


Table A-5. Analyst Responses to LIST Questions 31 and 32, First
Acquisition. (Columns are ground truth categories.)

                                    Question 31                       Question 32
                            Non-          Non-Sm  Wheat or    Non-          Non-Sm  Wheat or
AI Response        Code     Ag    Fallow  Grain   Sm Grain    Ag    Fallow  Grain   Sm Grain
No Vegetation      -1        3     208     602       92        3     197     567       90
Indeterminate       0        0       2      54       20        0       9      70       23
Green Vegetation   1,2,3     2       9      85      110        2      13     104      109

Figure A-6. Flow of All Pixels through LIST.

Figure A-7. Flow of Wheat/Small Grains Pixels through LIST.

Figure A-8. Flow of Non-Small Grain Pixels through LIST.

Figure A-9. Flow of Non-Ag/Fallow Pixels through LIST.

Table A-6. Analyst Responses to LIST Question 31, Third Acquisition.
(Columns are ground truth categories.)

                                                   Non-Sm   Wheat or
AI Response            Code     Non-Ag   Fallow    Grain    Sm Grain
No vegetation          -1          4       185      364        47
Indeterminate           0          0         3       41        10
Green vegetation       1,2,3       1        27      251        82
Senescing/Harvested    4,5         0         4       55        83


Even as late as the fourth and final acquisition there were 269 "non-
small grain" and 56 "wheat/small grain" pixels for which Question 31 was
answered "no vegetation." Many of these pixels were misclassified as
"fallow." When the AI has answered both Questions 31 and 32 with -1 for all
four acquisitions, he is instructed to take a path (Question 33) which
leads to a non-crop (non-ag or fallow) label. Otherwise he follows a
path (Question 34) which leads to a crop label (wheat, small grain,
non-small grain). The correct path was followed for 910 of the 1187
pixels, as shown by the accompanying matrix.




                            Ground Truth
                          crop    non-crop    Total
AI          crop           770        84        854
Decision    non-crop       193       140        333
            Total          963       224       1187


Thus we have:

P(decide "crop" | "crop") = 770/963 = .800
P(decide "crop" | "non-crop") = 84/224 = .375
P(decide "non-crop" | "non-crop") = 140/224 = .625
P(decide "non-crop" | "crop") = 193/963 = .200.
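
Each of these conditional probabilities is a cell count divided by the total of its ground-truth column. As a minimal sketch (the dictionary layout and function name are illustrative assumptions, not part of the original analysis), they can be recomputed from the crop/non-crop matrix as follows:

```python
# Counts from the crop / non-crop matrix, indexed as counts[ai_decision][ground_truth]
counts = {
    "crop": {"crop": 770, "non-crop": 84},
    "non-crop": {"crop": 193, "non-crop": 140},
}


def p_decide_given_truth(counts, decision, truth):
    """P(AI decides `decision` | ground truth is `truth`)."""
    column_total = sum(row[truth] for row in counts.values())
    return counts[decision][truth] / column_total


for truth in ("crop", "non-crop"):
    for decision in ("crop", "non-crop"):
        p = p_decide_given_truth(counts, decision, truth)
        print(f'P(decide "{decision}" | "{truth}") = {p:.3f}')
```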

The largest percentage error comes from calling a pixel "crop" when in
fact it is non-ag or fallow. The largest absolute error (193 pixels)
comes from calling a pixel "non-ag" or "fallow" when in fact it is a
"crop." Question 33 was answered for 333 pixels: all those that got
straight -1's on Questions 31 and 32. Of these, only 140 deserved
straight -1's, so the answer to Question 33 was bound to be "wrong" for
the other 193. For the 140 on which the analyst had a chance to be
right, he was correct for all of them.

The next question that leads to a parting of ways is Question 39:

Does pixel follow a small grains spectral development pattern?

If answered "yes" this leads to "wheat" or "small grain" choices. If
answered "no" it leads to "non-small grain." (There is a possible loop
back to re-evaluate, but the loop was never taken.) This question was
answered for 854 pixels, 770 of which really were "crop" and 84 were
"non-crop." If we look at all responses from the standpoint of interest
only in "small grain" versus "everything else," we have:


                              Ground Truth
                          small    everything
                          grain       else       Total
AI          small grain    147        140         287
Decision    other crops     51        516         567
            Total          198        656         854

Remember that the "everything else" category includes 84 pixels that are
not any kind of crop; therefore, all of these must be misclassified at 
this stage, as either "small grain" or "other crops." A more detailed 
breakdown is shown in Table A-7. 


Table A-7. Results of Analyst Response to Question 39.

                                     Ground Truth
                               Small    Other    Non-
                               Grain    Crops    Crop     Total
AI          Small Grain         147      127       13       287
Decision    Other Crops          51      445       71       567
Pixels for which question
was not answered                 24      169      140       333
Total                           222      741      224      1187


Which table is more useful depends on what it is important to identify 
correctly — if a pixel is not wheat, we may or may not care what it 
really is. We have: 


P(deciding "small grain" | it really is small grain and Q39 was reached)
    = 147/198 = .74

P(deciding "small grain" | it really is small grain)
    = 147/222 = .66


The second probability is lower because of the 24 small grain pixels that 
were earlier classified as "non-crop." 

If Question 39 was answered "no," the AI proceeds to Questions 42,
43, and 44. If Question 44 is reached and answered "yes," the AI is
directed to "go to Question 39 and re-evaluate." This path was never
followed; Question 44 was reached for 59 pixels, but was answered "no"
for all of them. All other paths in this section lead to a "non-small
grains" label (which means "crop other than small grain"). Thus,

P(deciding "non-small grain" | it really is "non-small grain"
    and Q39 was reached) = 445/572 = .78

P(deciding "non-small grain" | it really is "non-small grain")
    = 445/741 = .60


If Question 39 was answered "yes," the AI goes to a series of questions
designed to decide whether the pixel can be determined to be specifically 
"wheat" or must be left with the more general "small grains" label. The 
key question here is 41: 


Does all available data indicate wheat is the only small 
grain in this area? 


Of 287 times this question was reached, it was answered "yes" 244 times 
and those pixels were subsequently labeled "wheat." The remaining 43 
pixels led on to Questions 45 and 46 and were then labeled "small grains." 
Questions 47 to 51 were never reached. 


Table A-8. Results of Analyst Response to Question 41.

                                     Ground Truth
                                        small
                               wheat    grain    other    Total
AI          wheat               128        3      113       244
Decision    small grain          16        0       27        43
Pixels for which question
was not answered                 72        3      825       900
Total                           216        6      965      1187

Looking at the overall labeling results, the AI and ground truth
never agreed on a "small grains" pixel, but since there were only 6 real
"small grains" other than "wheat" pixels, this is not surprising. (Note
that the AI category "small grains" includes those pixels that may be
wheat but which are in areas where wheat can't be distinguished from
other small grains. The GT category "small grains" contains only "non-
wheat small grains.")


If we redefine "small grains" to include wheat for both AI and
ground truth we have the results shown in Table A-9.


Table A-9. Two-Way Comparison of Analyst
Labels and Ground Truth.

                                  Ground Truth
                              small    everything
                              grain       else       Total
AI          small grain        147        140         287
Decision    everything else     75        825         900
            Total              222        965        1187


This gives:

P(correct decision) = (147 + 825)/1187 = .82

P(decide "small grain" | it really is "small grain") = 147/222 = .66

P(decide "small grain" | it really is something else) = 140/965 = .15

P(decide "something else" | it really is "small grain") = 75/222 = .34

P(decide "something else" | it really is "something else") = 825/965 = .85


The actual percentage of labeled pixels that are "small grain" is
100(222/1187) = 18.7%. However, the AI decided "small grain" for
100(287/1187) = 24.2% of the pixels, thus overestimating the proportion
of "small grains" by a factor of

24.2/18.7 = 1.3
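
The two-way figures quoted above all follow from the four cell counts of Table A-9. A minimal sketch that recomputes the overall accuracy, the true and labeled small-grain proportions, and the overestimation factor is given here; the variable names are illustrative and not taken from the original analysis.

```python
# Table A-9 counts, indexed as table[ai_decision][ground_truth]
table = {
    "small grain": {"small grain": 147, "everything else": 140},
    "everything else": {"small grain": 75, "everything else": 825},
}

total = sum(sum(row.values()) for row in table.values())           # all labeled pixels
correct = sum(table[label][label] for label in table)              # diagonal cells
true_small_grain = sum(row["small grain"] for row in table.values())
labeled_small_grain = sum(table["small grain"].values())

accuracy = correct / total
true_fraction = true_small_grain / total
labeled_fraction = labeled_small_grain / total
overestimation = labeled_fraction / true_fraction

print(f"accuracy              = {accuracy:.2f}")
print(f"true small grain      = {100 * true_fraction:.1f} %")
print(f"labeled small grain   = {100 * labeled_fraction:.1f} %")
print(f"overestimation factor = {overestimation:.1f}")
```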

Note that the proportion of correct decisions depends on the definition
of "correct." Above we used a two-way criterion: small grains vs.
everything else. A three-way criterion (small grains vs. non-small
grain crops vs. non-crops) gives a lower proportion of "correctness,"
as shown in Table A-10.

Table A-10. Three-Way Discrimination

                                  Ground Truth
                              small    non-sm    non-
                              grain    grain     crop
            small grain        147      127       13
AI          non-sm grain        51      445       71
Decision    non-crop            24      169      140


P(correct decision) = (147 + 445 + 140)/1187 = .62


Recall that a key point in the decision making process comes at
Question 39, where "small grains" and "non-small grains" paths divide.
The answer given to this question depends heavily on the answers that
were given to Questions 34, 35, 37, and 38. Questions 34 and 35 are
parallel questions that ask:

Is the vegetation indication of the pixel on PFC Product 1
(Product 3) valid for the Robertson biostage of wheat for
the acquisition?

The pattern of response to each question is similar, so we will look
only at Question 34. This question is answered separately for each
acquisition, either "yes," "no," or "indeterminate." Let us for each
pixel count the number of times the question was answered "yes." This
can range from 4 (since there are 4 acquisitions) down to 0 (answered
"no" or "indeterminate" for each date). Note that 1 could represent
response patterns YNNN, NYNI, INYI, etc., but in each case the question
is answered "yes" for exactly one of the four dates. Similarly, categories
2 and 3 contain many possible patterns of response.
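
The tabulation described here reduces each four-letter response pattern to its number of "yes" answers. As a minimal sketch, assuming responses are encoded as strings of Y/N/I characters (one per acquisition), the counting could be done as follows; the function name and the sample patterns are hypothetical.

```python
from collections import Counter


def count_yes(pattern):
    """Number of 'yes' answers in a per-acquisition response string such as 'YNNI'."""
    if len(pattern) != 4 or any(ch not in "YNI" for ch in pattern.upper()):
        raise ValueError(f"expected four characters from Y/N/I, got {pattern!r}")
    return pattern.upper().count("Y")


# Hypothetical responses for a handful of pixels (not data from the study)
patterns = ["YNNN", "NYNI", "INYI", "YYYY", "NINI", "YYNY"]

# Histogram of pixels by number of "yes" answers, as in the tables that follow
histogram = Counter(count_yes(p) for p in patterns)
for n_yes in range(5):
    print(n_yes, histogram[n_yes])
```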


Table A-11. Pattern of Responses to Question 34

                       Ground Truth                    AI Label
No. of "yes"      Sm     Non-sm    Non-          Sm     Non-sm    Non-
answers          Grain   Grain     Crop         Grain   Grain     Crop
     0              3      49       19             8      63        0
     1             21     198       32            15     236        0
     2             39     154       19            41     171        0
     3             47     103       10            77      83        0
     4             88      68        4           146      14        0


There were 160 pixels for which Question 34 was answered YYYY; 88 of
these were really small grain, 68 non-small grain, and 4 were non-crop.
However, the AI labeled 146 of them small grain, and 14 non-small grain.
(He could not label any of them non-crop since that path had branched off
earlier.) Thus the AI decision is highly correlated with the answer to
this question. The likelihood that a pixel is really small grain shows
moderate correlation to the answer, but there are 21 small grain pixels
that received only one "yes." Also, 72 non-small grain or non-crop
pixels received four "yes" responses and 113 received three "yes"
responses; many of these were later misclassified as "small grains."

Question 37 asks: 


Is the green number of pixel within the range for small grains? 


It too is answered once for each acquisition, and we use the same 
technique as above to display the results in Table A-12. 


Table A-12. Pattern of Responses to Question 37

                       Ground Truth                    AI Label
No. of "yes"      Sm     Non-sm    Non-          Sm     Non-sm    Non-
answers          Grain   Grain     Crop         Grain   Grain     Crop
     0              4      60       14             1      77        0
     1             11     102       29            16     126        0
     2             44     142       24            53     157        0
     3             56     234       14           106     198        0
     4             83      34        3           111       9        0


Question 37 was answered YYYY for 120 pixels, 83 of which were really
"small grains." So the "typical small grain green number pattern" is
followed by some non-small grain pixels and is not followed by some
small grain pixels. There were 59 small grain pixels that had 0, 1, or
2 "yes" answers, and 285 "everything else" pixels with 3 or 4 "yes"
answers. Once again, the AI labels are correlated highly with the answer
to the question.




Question 38 asks: 


Does the trajectory plot of this pixel match a small grains 
trajectory plot? 

Since the plot incorporated information from all four acquisitions, the
question is answered only once for each pixel. The results are summarized
in Table A-13.


Table A-13. Results of Analyst Response to Question 38

                       Ground Truth                    AI Label
                  Sm     Non-sm    Non-          Sm     Non-sm    Non-
Answer           Grain   Grain     Crop         Grain   Grain     Crop
No                 70     460       67            61     536        0
Ind.               18      53        6            55      22        0
Yes               110      59       11           171       9        0


Again, the AI label agrees closely with the answer to the question. The
actual situation is less neatly defined. There were 180 "yes" answers,
171 of which were labeled "small grains," but only 110 of which were
really "small grains."

Conclusions. In the LIST questionnaire, the decision between "crop"
and "non-crop" comes early, and is based primarily on the appearance
(color) of the pixel on the PFCs. Out of a total of 1187 pixels analyzed,
the AIs followed the path leading to "non-crop" labels 333 times, but
only 140 of these pixels actually were "fallow" or "non-agricultural."
(However, the 1976 drought in Kansas may have made some planted fields
appear fallow. Some fields may have been plowed under after being
planted. We do not know how the "ground truth" labeled such fields.)
If the "ground truth" reflects the actual condition of the fields during
the growing season, then the low accuracy of the "crop" vs. "non-crop"
decision suggests that more than just the available information used by
the LIST is required to make this discrimination accurately. The present
decision criterion (spectral image interpretation) requires human
capabilities and is not directly amenable to machine implementation. Since the
color images were created from digitized data, it would be desirable to
find an alternative method of structuring and categorizing the data in
a multitemporal, computer-oriented form.

After careful study of the LIST process, it is not surprising that
the analyst decision (Question 39) between "small grains/wheat" and
"other crops" is highly influenced by and correlated with the answers to
Questions 34, 35, 37, and 38. Any improvement in methods (e.g.,
quantitative aids) for judging green numbers or assessing trajectory plots
should lead to more accurate decisions. Questions beyond 39 had little
pixel-dependent discriminatory power for the data set used in this analysis.
(There was segment-dependent discrimination between "wheat" and "small
grains" in Questions 40 and 41.) The next section of the report will
discuss machine-implemented methods of dealing with Questions 34-38.

Summary.

We draw together here the key observations concerning our analysis 
of the LIST process. 

The training sample labeling process systematized in the LIST method 
is still very subjective. It is not surprising, then, that prior 
knowledge of the wheat-growing process is felt to be a considerable asset 
to the analyst, permitting him/her a more insightful understanding of 
the questions and a better ability to recognize abnormal situations in 
the data. Such analysts would also be more likely to provide effective 
guidance with respect to further improvement of the LIST process. 

The length and tedium of the process are clearly problematical. The
analysts felt that certain portions of the process could be automated to
alleviate this situation, although they added that the level of
subjectivity militates against total automation. We shall show later, however,
that accuracies at least comparable to those of the participating analysts
may be obtainable by appropriate quantification of key features used in
the LIST process.

It was interesting to discover that the major differences in LIST
labeling results were attributable to segment variability. Analyst and
segment/analyst interaction effects were statistically insignificant.
Also, no significant performance difference was observed when the
questionnaire was revised based on analyst recommendations. The full implications
of these observations should be further explored. However, one conclusion
which may be inferred is that the quality of the classification results,
known to be sensitive to the quality of the training sample labeling,
depends more on segment-to-segment variations of the data than on the
analyst selected to perform the labeling. Efforts to stratify the data
may still pay dividends, therefore, especially when one attempts to
automate the labeling process using methods based on parameterization of
data characteristics.

As it now stands, the LIST process depends heavily on the ability
of the AI to quantify the spectral response of the pixels to be labeled
and effectively compare the spectral response to some rather loosely
defined standards for discriminating wheat from nonwheat. Basically,
he/she is expected to do a very quantitative job using tools which at
best are only quasi-quantitative. The impact of this situation is
reflected in the level of analyst-dependent variability in the results
for any given segment (notwithstanding that we have already shown that
this variability is relatively insignificant as compared to the variability
resulting from segment-to-segment variation in the data). There is
clearly room for improvement in the process through development of more
quantitative tools.

2.2 Toward Computer Implementation of a LIST-like Labeling Process

A recent technical report by Abotteen and Pore [4] discussed a
method to automate a portion of the LIST questionnaire. In that report
a revised LIST questionnaire incorporating the automation was presented
and evaluated over two LACIE spring wheat segments in North Dakota.
The high classification accuracies reported by Abotteen and Pore for
these two segments encouraged further investigation into their method
and evaluation of it over additional LACIE segments. We anticipated
that such an investigation would point to a still more quantitative and
"automatic" implementation of a LIST-like labeling process.

Approach 


Questions 34 and 35 from the LIST questionnaire ask:

34. Is the vegetation indication of the pixel on PFC Product 1
valid for the Robertson biostage of wheat for the
acquisition?

35. Is the vegetation indication of the pixel on PFC Kraus
product valid for the Robertson biostage of wheat for
the acquisition?

The vegetation indication mentioned in Questions 34 and 35 is the
response to the LIST Questions 31 and 32, respectively (see Appendix A-1).
These responses are coded evaluations of the nature of vegetation canopy
indicated to the analyst by the Product 1 image (Question 31) and the
Kraus product image (Question 32).

Abotteen and Pore describe a rule for answering Questions 34 and 35 
based on the analyst response to Questions 31 and 32. They combined 
Question 31 with 32 and Question 34 with 35, but their report is not 
clear as to how this combination was done. Because of this ambiguity, 
the implementation described here answers Questions 34 and 35 separately. 





The vegetation canopy indication code used by Abotteen and Pore in
Questions 31 and 32 is slightly different from the code used by the
analysts in this study. Abotteen and Pore use code 0 to indicate "no
vegetation canopy," whereas this study used code 0 to indicate
"indeterminate," and code -1 to indicate "no vegetation canopy." This
is accounted for here by assuming analyst responses -1 and 0 to both be
equivalent to code 0 of Abotteen and Pore.

Except for the minor modifications mentioned, our computer
implementation of Questions 34 and 35 followed Abotteen and Pore exactly*:


...Figure A-10 describes the automation technique. It is 
a chart of the Robertson biostage on the horizontal axis 
versus the vegetation canopy (Question 31/32) on the vertical 
axis. ...For each acquisition, a point is located in Figure 
A-10 with the horizontal axis coordinate corresponding to 
the Robertson biostage for wheat for the acquisition and 
the vertical axis coordinate corresponding to the answer (for 
a given pixel) to Question 31/32. If the point is in the
blank or the dotted area, Question 34/35 is automatically
answered with a yes. If the point is in the shaded (barred)
area, the answer is no (for that pixel and acquisition) . 
Vertical borders belong to the class on the left. [4] 


Question 39 from the LIST questionnaire asks: 


39. Does pixel follow a small grains spectral development 
pattern? 

Again referring to Figure A-10, Abotteen and Pore suggest the 
following rule for answering the question: 


Question 39 is answered with a yes if the points corre- 
sponding to the four acquisitions are all in the blank or 
dotted regions [with] at least two in the blank region. 
...Hence, the dotted region in Figure A-10... is used as a 
different designation from the blank region... for answering 
Question 39 only. [4] 
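
The two quoted rules can be sketched as follows, assuming that the region of Figure A-10 in which each acquisition's point falls has already been determined and is supplied as one of the strings "blank", "dotted", or "shaded". The region lookup itself depends on the figure and is not reproduced; the function names are illustrative and are not those of the original implementation.

```python
VALID_REGIONS = ("blank", "dotted", "shaded")


def answer_q34_35(region):
    """Quoted rule for Question 34/35: 'yes' (True) in the blank or dotted area."""
    if region not in VALID_REGIONS:
        raise ValueError(f"unknown region: {region!r}")
    return region in ("blank", "dotted")


def answer_q39(regions):
    """Quoted rule for Question 39, given the regions for the four acquisitions.

    'Yes' (True) only if every point is in the blank or dotted region and
    at least two of the four points are in the blank region.
    """
    regions = list(regions)
    if len(regions) != 4:
        raise ValueError("expected one region per acquisition (four in total)")
    all_valid = all(answer_q34_35(r) for r in regions)
    return all_valid and regions.count("blank") >= 2


print(answer_q39(["blank", "dotted", "blank", "dotted"]))   # two blank, none shaded
print(answer_q39(["blank", "dotted", "dotted", "dotted"]))  # only one blank
print(answer_q39(["blank", "blank", "blank", "shaded"]))    # one shaded point
```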


* The inset material is taken directly from [4] except that figure
numbers are adapted to this report.



Figure A-10. Expected Vegetation Canopy as a Function of Robertson Biostage.
(Vertical axis: vegetation canopy indication code, where -1 = no vegetation,
0 = indeterminate, 1 = low density, 2 = medium density, 3 = high density,
4 = turning, 5 = harvested. Horizontal axis: Robertson biostage, 1.0 to 7.0.)

An alternative rule for answering Question 39 is discussed later.

The computed results corresponding to Questions 34, 35, and 39
depend directly on the analyst responses to Questions 31 and 32 (i.e.,
on the perceived vegetation canopy indication). This dependence poses
problems when further automation of the process is considered. Also,
since the answers to Questions 31 and 32 depend on the analyst's subjective
interpretation of the imagery, they are not necessarily consistent
from analyst to analyst.

LIST Question 37 asks: 

37. Is the green number of the pixel within the range for
small grains?

Abotteen and Pore suggest that a green number grand mean and standard
deviation for small grains be used at each Robertson biostage to
determine a standard range for small grains. Our implementation of this
idea answers Question 37 "yes" if the green number lies within one
grand standard deviation of the mean; the response is "indeterminate"
if the green number lies between one and two grand standard deviations from
the mean and "no" if it lies two standard deviations or more from the mean.
But if an overall green number mean and standard deviation must be calculated,
what samples should be used for the calculation? Abotteen and Pore
used 34 unspecified LACIE segments to calculate a green number mean and
standard deviation for all winter small grains and spring small grains.
However, one might expect that more accuracy would be obtained if
separate green number means and standard deviations were used for spring
and/or winter small grains in certain geographical areas (such as
universal strata).
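
The three-way rule for Question 37 can be sketched as below. This is an illustration under stated assumptions, not the original FORTRAN: the function name is hypothetical, a green number lying exactly one standard deviation from the mean is assumed to count as "yes", and the mean and standard deviation used in the example are made-up values rather than the Abotteen and Pore statistics.

```python
def answer_q37(green_number, grand_mean, grand_std):
    """Classify one acquisition's green number against the small grains range."""
    if grand_std <= 0:
        raise ValueError("grand_std must be positive")
    # Distance from the grand mean in units of the grand standard deviation
    deviations = abs(green_number - grand_mean) / grand_std
    if deviations <= 1.0:       # assumption: the boundary itself counts as "yes"
        return "yes"
    if deviations < 2.0:        # strictly between one and two standard deviations
        return "indeterminate"
    return "no"                 # two standard deviations or beyond


# Hypothetical statistics for one Robertson biostage (illustrative values only)
mean, std = 20.0, 5.0
for g in (22.0, 28.0, 30.0, 8.0):
    print(g, answer_q37(g, mean, std))
```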

Abotteen and Pore introduce what they term the PCG (Principal
Component Greenness) statistic as another feature for separating small
grains from non-small grains. It is calculated by taking the inner
product of the first greenness image eigenvector (see Abotteen [7])
with the green number vector for the pixel under consideration. (The
green number vector for a pixel is defined by G = [g_1, g_2, g_3, g_4]^T,
where g_i is the green number for the pixel at the ith acquisition.)

The first greenness image eigenvector plotted versus Robertson
biostage has a shape similar to an "ideal" small grains temporal
trajectory. The PCG statistic is therefore an appropriate feature to use in
answering LIST Question 38:

38. Does the trajectory plot of this pixel match a small
grains trajectory plot?

This PCG statistic is, however, influenced by the size of the
elements of the green number vector. For example, given G = [24, 45,
10, 43]^T and the first greenness image eigenvector = [.59, .69, .53,
-.25]^T, the PCG statistic is 39.8, a rather large value; however, the
green numbers definitely do not follow a typical small grains temporal
trajectory, and the ground truth label for this pixel is "non-small
grains." Because of this problem it was decided to normalize the PCG
statistic by dividing by the 2-norm of the green number vector and
multiplying by 40 (to maintain a convenient magnitude). See Table A-14
for further examples. This normalization does not always reduce the
PCG statistic for non-small grain pixels (and vice versa), but it does
guarantee that the PCG statistic is uninfluenced by green number size
and is thus a measure of trend only. The implementation described
herein uses the normalized PCG statistic.
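
As a worked illustration of the two statistics, the sketch below computes the unnormalized and normalized PCG for the example pixel. It assumes the eigenvector components [.59, .69, .53, -.25], which are the values that reproduce the quoted 39.8; the function names are illustrative and do not come from the original program.

```python
import math


def pcg(green, eigenvector):
    """Unnormalized PCG: inner product of the eigenvector with the green number vector."""
    if len(green) != len(eigenvector):
        raise ValueError("vectors must have the same length")
    return sum(g * e for g, e in zip(green, eigenvector))


def normalized_pcg(green, eigenvector, scale=40.0):
    """PCG divided by the 2-norm of the green number vector, then multiplied by 40."""
    norm = math.sqrt(sum(g * g for g in green))
    if norm == 0:
        raise ValueError("green number vector has zero length")
    return scale * pcg(green, eigenvector) / norm


eigenvector = [0.59, 0.69, 0.53, -0.25]  # assumed components (see lead-in)
green = [24, 45, 10, 43]                 # example pixel from the text

print(f"PCG            = {pcg(green, eigenvector):.1f}")
print(f"normalized PCG = {normalized_pcg(green, eigenvector):.1f}")

# Doubling every green number doubles the PCG but leaves the normalized value unchanged
doubled = [2 * g for g in green]
print(f"doubled: PCG = {pcg(doubled, eigenvector):.1f}, "
      f"normalized = {normalized_pcg(doubled, eigenvector):.1f}")
```

The last lines illustrate the design choice: dividing by the 2-norm makes the statistic depend only on the direction of the green number vector, that is, on the shape of the temporal trend.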

Table A-14. Comparison of Unnormalized and Normalized PCG Statistic.

                        Unnormalized     Normalized
G                       PCG Statistic    PCG Statistic    Ground Truth
[24, 45, 10, 43]^T          39.8             23.6         Non-Small Grains
[2, 11, 6, 1]^T             11.7             36.8         Winter Wheat
[4, 6, 6, 4]^T               8.7             34.0         Winter Wheat
[4, 18, 24, 13]^T           32.6             39.2         Winter Wheat
[3, 17, 18, 12]^T           20.0             29.0         Non-Small Grains

Abotteen and Pore used four acquisitions each of seven unspecified
winter small grains LACIE segments acquired in the 1976 crop year to
calculate the components of the first greenness image eigenvector for
various Robertson biostages. Six unspecified LACIE segments were used
in the calculation for spring small grains. As with the calculation of
green number mean and standard deviation, we speculate, based on analyst
experience, that more accuracy might be obtained if separate calculations
were performed for differing geographical areas.

Another problem with the implementation of Question 38 is related
to how large the normalized PCG statistic should be before the answer
is given as "yes." As discussed in the next section, a threshold value
near 25 seems to be appropriate for the LACIE segments considered; but
it is not known whether this value would be universally applicable.

Discussed earlier was the Abotteen and Pore rule for answering 
LIST Question 39: 

39. Does pixel follow a small grains spectral development 
pattern? 

An alternative rule can easily be devised based on the computed responses
to Questions 37 and 38: answer Question 39 "yes" if the answer to
Question 38 is "yes" and either (a) Question 37 is answered "yes" or
"indeterminate" for all acquisitions, or (b) Question 37 is "no" for only
one acquisition and "yes" for the three remaining acquisitions. Both rules
were implemented.
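
The alternative rule combines one Question 38 answer with four Question 37 answers. A minimal sketch follows, assuming Question 38 is answered "yes" when the normalized PCG statistic reaches a threshold of 25 (the text says only "near 25", and whether the boundary is inclusive is an assumption); the function names are illustrative.

```python
def answer_q38(normalized_pcg_value, threshold=25.0):
    """Assumed form of the Question 38 test: 'yes' (True) at or above the threshold."""
    return normalized_pcg_value >= threshold


def answer_q39_alternative(q38_yes, q37_answers):
    """Alternative rule for Question 39 from the computed Q37 and Q38 answers.

    q37_answers holds one of "yes", "no", "indeterminate" per acquisition.
    """
    q37_answers = list(q37_answers)
    if len(q37_answers) != 4:
        raise ValueError("expected four Question 37 answers")
    if any(a not in ("yes", "no", "indeterminate") for a in q37_answers):
        raise ValueError("answers must be 'yes', 'no' or 'indeterminate'")
    n_no = q37_answers.count("no")
    n_yes = q37_answers.count("yes")
    no_negative_answers = n_no == 0              # all "yes" or "indeterminate"
    one_no_three_yes = n_no == 1 and n_yes == 3  # a single "no", the rest "yes"
    return q38_yes and (no_negative_answers or one_no_three_yes)


q38 = answer_q38(36.8)
print(answer_q39_alternative(q38, ["yes", "indeterminate", "yes", "yes"]))
print(answer_q39_alternative(q38, ["yes", "no", "yes", "yes"]))
print(answer_q39_alternative(q38, ["yes", "no", "indeterminate", "yes"]))
```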

Experimental Results.

LIST Questions 34, 35, 37, 38 and 39 were implemented using a
simple FORTRAN program. The program was run on the seven 1976 LACIE
segments in Kansas which had been labeled earlier by three analysts.
The rule for answering Question 39 based on Questions 37 and 38 was
used for the evaluation discussed here. The other rule (based on 34
and 35) would really only evaluate the analyst responses to Questions
31 and 32.



The Abotteen and Pore values for the first greenness image eigenvector
and for green number mean and standard deviation were used. Green
number means and standard deviations were also calculated based on the
seven LACIE segments considered, but use of those means and standard
deviations did not significantly affect the labeling results. Of course,
when the green number means and standard deviations are calculated from
only the segment being labeled, the accuracy is increased (e.g., segment
1860 accuracy was increased from 87.9% to 94.8%), but such a comparison
is not valid, since it amounts to testing accuracy on the training
segment.

A convenient method by which to evaluate the effectiveness of the
implementation is to interpret a positive response to Question 39 as
labeling the pixel "wheat" (or "small grains") and a negative response
as labeling the pixel "nonwheat" (or "non-small grains"), and to compute
the resulting accuracy based on ground truth. Table A-15 lists the
labeling accuracies obtained using the computed answers to LIST Question
39 and compares this to the accuracies obtained by the three analysts.

The mean accuracy for the computer labeling was higher than that for
any one analyst. Also, the standard deviation of the computer labeling
accuracy was somewhat lower than every analyst standard deviation.
Further, looking at each individual segment accuracy, the computer "won"
eleven times while the analysts "won" ten times. Thus the computer
labeling, based solely on green numbers, was on the whole at least as
accurate as labeling by analysts who had access to much more information
than just green numbers.
Table A-15. Comparison of Analyst and Computer Labeling Accuracies.

                      Labeling Accuracies (%)

                      Analyst                Computer
Segment        A1       A2       A3       Threshold = 25

1163          84.5     87.9     76.9          91.2
1165          98.4    100       79.7          95.3
1852          89.7     92.4     89.4          84.8
1855          59.4     82.8     82.8          78.1
1857          85.2     78.7     78.5          77.0
1860          65.5     62.1     66.1          87.9
1865          82.4     85.3     69.4          89.0

mean          80.7     84.2     77.5          86.2
std. dev.     13.6     11.9      7.9           6.7
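
The summary rows of Table A-15 and the win/loss tally quoted in the text can be checked directly from the per-segment accuracies. The sketch below is illustrative only; the variable names are assumptions, and the use of the sample (n - 1) standard deviation is an assumption that happens to reproduce the tabulated values.

```python
from statistics import mean, stdev

# Per-segment labeling accuracies (%) from Table A-15, in segment order
# 1163, 1165, 1852, 1855, 1857, 1860, 1865.
accuracies = {
    "A1": [84.5, 98.4, 89.7, 59.4, 85.2, 65.5, 82.4],
    "A2": [87.9, 100.0, 92.4, 82.8, 78.7, 62.1, 85.3],
    "A3": [76.9, 79.7, 89.4, 82.8, 78.5, 66.1, 69.4],
    "computer": [91.2, 95.3, 84.8, 78.1, 77.0, 87.9, 89.0],
}

# Mean and sample standard deviation for each labeler, rounded as in the table.
summary = {
    name: (round(mean(values), 1), round(stdev(values), 1))
    for name, values in accuracies.items()
}

# Segment-by-segment comparison of the computer against each analyst
# (7 segments x 3 analysts = 21 comparisons).
pairs = [
    (c, a)
    for analyst in ("A1", "A2", "A3")
    for c, a in zip(accuracies["computer"], accuracies[analyst])
]
computer_wins = sum(c > a for c, a in pairs)
analyst_wins = sum(a > c for c, a in pairs)

for name, (m, s) in summary.items():
    print(f"{name:9s} mean = {m:5.1f}  std. dev. = {s:5.1f}")
print("computer wins:", computer_wins, " analyst wins:", analyst_wins)
```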


The results in Table A-15 were obtained using a threshold of 25 in
answering Question 38 (and thus Question 39). Thresholds of 20 and 30
were also tried. The threshold of 25 was chosen, not because it gave a
significantly more accurate labeling in the sense of correctly labeled
pixels versus the total number of pixels (which it did not), but because
it tended to give a more accurate estimate of the total number of small
grain pixels in the segment.

Discussion and Conclusions.

In Section 2.1, we demonstrated the impact of Questions 31, 32 and
34 through 38 on the results of the LIST process. These questions
direct the AI to a close scrutiny of the spectral response of pixels
already determined to be vegetation. They attempt to provide the AI
with objective evidence on which to base the crucial decision at
Question 39 which, for all practical purposes, implements the
discrimination between small grains and non-small grains. Available
experimental results suggest the following conclusions:
1. Given the vegetation canopy indication (AI response to Questions
31 and 32) and the Robertson biostage (Question 16), it is possible to
"compute" (essentially by table look-up) the response to Questions 34 and
35 (valid vegetation canopy indication). Furthermore, the response to
Question 39 can be computed based on the outcome of Questions 31, 32, 34
and 35.

2. The decisions called for in Questions 37 (green number for 
small grains) and 38 (small grains trajectory plot) are quantifiable 
provided that the necessary statistics are available and invariant over
time and location. The response to Question 39 can then be computed 
based on Questions 37 and 38. Our experimental results support the 
possibility of obtaining the necessary statistics. Labeling results 
based on only the normalized inner product of the green number (temporal) 
vector and the first greenness image eigenvector rivaled those obtained 
by AIs using the complete LIST process. 

The latter conclusion is particularly important because it suggests
that the small grains/non-small grains determination can be made by
machine computation just as accurately and much more efficiently than by
the AI using the LIST questionnaire. This could be used to advantage to
greatly reduce the tedium of the AI's task, although the AI may still be
employed to monitor the results for anomalous cases and, as necessary,
to discriminate wheat from other small grains.

Since the computations involved in this automated decision process 
are very simple, one could easily conceive of applying them to the entire 
segment. A map of the results (e.g., a PFC image), possibly a color- 
coded rendering of the normalized inner product of green number vector 
and greenness image eigenvector, would likely be of great assistance to 
the AI in the labeling process. 
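
As a rough illustration of how simple the per-pixel computation is, the sketch below labels every pixel of a small segment from its temporal green number vector. It is a sketch under stated assumptions: the normalization (division by the length of the eigenvector), the use of "greater than or equal to" at the threshold, the function names, and the eigenvector and pixel values in the usage lines are all hypothetical and are not taken from the Abotteen and Pore procedure. Only the threshold value of 25 comes from the text.

```python
import math

THRESHOLD = 25.0  # value reported in the text for the segments studied


def normalized_inner_product(green_numbers, eigenvector):
    """Inner product of the green number vector with the eigenvector,
    divided by the eigenvector length (assumed normalization)."""
    norm = math.sqrt(sum(e * e for e in eigenvector))
    if norm == 0.0:
        raise ValueError("eigenvector must be non-zero")
    dot = sum(g * e for g, e in zip(green_numbers, eigenvector))
    return dot / norm


def label_segment(pixels, eigenvector, threshold=THRESHOLD):
    """Label each pixel in a 2-D grid of temporal green number vectors."""
    return [
        [
            "small grains"
            if normalized_inner_product(p, eigenvector) >= threshold
            else "non-small grains"
            for p in row
        ]
        for row in pixels
    ]


# Hypothetical four-acquisition example (values invented for illustration).
example_eigenvector = [0.5, 0.5, 0.5, 0.5]
example_pixels = [[[10, 20, 30, 10], [5, 5, 5, 5]]]
example_labels = label_segment(example_pixels, example_eigenvector)
```

The same grid of statistic values, before thresholding, is what a color-coded rendering for the AI would display.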



3. SUMMARY AND RECOMMENDATIONS


The LIST method for labeling training data (in lieu of ground
observations) represents a workable approach, though it is generally recognized
that substantial improvements are needed and possible. In particular, 
more effective use can be made of the analyst-interpreter by making the 
questionnaire shorter and less tedious to apply. Some of the questions 
need to be made more objective and additional quantitative aids should be 
provided to the AI. A temporal greenness trajectory function has been 
proposed and, in this study, shown to provide a means of objectively 
discriminating between the categories "small grains" and "non-small 
grains." Experimental results obtained by Lockheed Electronics Corp. 
and LARS suggest that it may be feasible to automate this discrimination. 

It should be pointed out, however, that the LIST process calls for
data from strategically timed acquisitions of the multispectral data, and
the data base used in this study was selected to meet this requirement.
The impact of poorly timed or missing acquisitions has not been assessed,
but it is likely to be significant. Further research is required to find
ways of minimizing this impact.


Finally, we note that the LIST process is, after all, a method for
unsupervised classification, that is, classification of the primary remote
sensing data without benefit of "ground truth" for definition of training
samples. As such it cannot be as powerful for achieving accurate
discrimination as a supervised method would be (the latter makes more
definitive associations between information classes and corresponding
regions of the measurement space). This must be kept in mind as efforts
are made to extend the approach in the direction of increasingly difficult
discriminations. It may eventually become necessary to consider
alternative strategies employing more direct information about the ground
scene (such as aerial photography), thereby reintroducing a greater degree
of supervision in the classifier training process. The alternative is to
continue the search for highly characteristic and invariant
spectral/spatial/temporal features or "signatures." The inherent
variability ("noisiness") of the natural scene makes progress in this
direction increasingly difficult.
Appendix A-1

List Experiment Questions [4] 


1. Segment #

2. Partition #

3. Segment Type

(winter wheat, spring wheat, mixed wheat)

4. Country

5. State

6. County

Segment Questions from Imagery 


7. Is there any agricultural land present in this segment? 

(Check full frame, multitemporal imagery, maps and previous year's
imagery.)

Yes: Go to 8 

No: Stop 

8. List the interpretable acquisition dates in the space provided (YDDD).

9. Acquisition date chosen by analyst as registration date is ________

Indicate (a) YDDD and (b) biowindow. (This is not necessarily the
Goddard reference segment.)

10. Is the segment representative of the general area? (Check full 
frame and ancillary data) 

Yes 

Indeterminate
No 

11. Are there strip fields in the cultivated area? 

Yes 

No 


Cropping Practices 

12. Are wheat and/or other small grains continuously cropped in this area? 
Yes 

Indeterminate 

No 
13. Is fallowing practiced in this area?

Yes 

Indeterminate 

No 

14. Are the small grains irrigated in this area? 

Yes 

Indeterminate 

No 

15. Determination of potential confusion: 

a. List the most recent percent of county area occupied by each of 
the applicable major crops. 


b. Using the nominal crop calendar, determine the possibility of 
confusion between wheat (winter and/or spring) and the other 
major crops for each acquisition. 

+1 = No confusion
0 = Indeterminate
-1 = Confusion


Met Data 


16. Robertson biostage for the segment for each acquisition is 

17. Total precipitation (in inches) for the week prior to each acquisition
as provided in the weekly meteorological summary is

18. Total precipitation (in inches) for the 3 days prior to each
acquisition is

19. Is there evidence of drought conditions (from met summary)? 

20. Is there evidence of winter kill (from met summary)? 

Yes 

Indeterminate

No 

21. Is there evidence of a late freeze (from met summary)?

Yes 

Indeterminate 
No 

22. Is there evidence of hail damage (from met summary)?

Yes

Indeterminate

No

23. Is there evidence of insects or disease (from met summary)?
Yes 

Indeterminate 

No 

24. Expected normal yield for this segment is . 


25. Evaluation of crop condition for each acquisition is 

2 = Significantly above normal 
1 = Above normal 
0 = Near normal 
-1 = Below normal 

-2 = Significantly below normal 
************ 

Delineate "DO" areas. (Area must apply to all acquisitions.) 
Delineate "DU" areas where applicable. 

************ 


Pixel Specific Questions 

26. Is pixel a DO’d pixel? 

Yes: Non-ag STOP

No: Go to 27

27. Is pixel a DU'd pixel?

Yes: STOP

No: Go to 28

28. Is pixel registered with regard to analyst chosen registration date?

Yes 

Indeterminate: Do not answer questions 36, 37, 38 for this acquisition. 

No: Do not answer questions 36, 37, 38 for this acquisition 

Go to 29 

29. Is pixel a mixed pixel (part of more than one field or boundary)? 

Yes 

Indeterminate 
No 

Go to 30 

30. Is this an anomalous pixel (not representative of most of the other
pixels within the field)?

Yes

No

Go to 31

31. PFC vegetation canopy indication is ________. (Product 1)

-1 = No vegetation canopy 

0 = Indeterminate 

1 = Low density green vegetation canopy 

2 = Medium density green vegetation canopy

3 = High density vegetation canopy 

4 = Senescing (turning) vegetation canopy 

5 = Harvested canopy (stubble) 

32. PFC vegetation indication is ________. (Kraus product)

Same code as #31. If -1 on 31 and 32 then go to 33. Otherwise go
to 34.

33. Is pixel a non-ag pixel? (Check all available data.) 

Yes : Non-ag STOP 

No: Fallow STOP 

34. Is the vegetation indication of the pixel on PFC Product 1 valid for 
the Robertson biostage of wheat for the acquisition? (Check keys for 
partition. ) 

Yes 

Indeterminable 

No 

35. Is the vegetation indication of the pixel on PFC Kraus product valid 
for the Robertson biostage of wheat for the acquisition? 

Yes 

Indeterminable

No 

36. Green number of pixel is ________. (Refer to question 28.
Correct the number to 60° latitude if appropriate.)

37. Is the green number of the pixel within the range for small grains? 
(Check green number/biostage chart.) 

Yes 

Indeterminable 

No 

38. Does the trajectory plot of this pixel match a small grains trajectory
plot? (Answer for fourth acquisition only.) 

Yes 

Indeterminable
No 

39. Does pixel follow a small grains spectral development pattern?

Yes: Go to 40

No: Go to 42

40. Does crop statistical data indicate wheat is the only small grain
in this area? 

Yes: Go to 41 

No: Go to 45 

41. Does all available data indicate wheat is the only small grain in this 
area? 

Yes: Wheat Go to 52 

No: Go to 45 

42. Does crop statistical data indicate significant occurrence of other
crop types? 

Yes: Go to 43 

No: Go to 44

43. If more than one non-small grain spectral signature is observed, do 
the proportions of the signatures correspond to the historical 
non-small grain percentages? 

Yes: Non-small grains STOP 

Indeterminate 

No: Go to 44 

44. Do ancillary and met data indicate that the departure of the observed
spectral signature from an expected normal small grains spectral
signature could be due to an abnormal small grains signature development?

Yes: Go to 39 and re-evaluate

No: Non-wheat STOP 

45. Does the nominal crop calendar indicate an out-of-phase relationship
between wheat and other confusion small grains? 

Yes: Go to 46 

No: Small grains STOP 

46. Can subclasses of small grains be identified on PFC products or 
spectral plots as early, medium, or late developing? 

Yes: Go to 47 

No: Small grains STOP 

47. Does the stage of development of any of these subclasses correspond
to the indicated stage of development for the out-of-phase confusion
small grain/s?

Yes: Go to 48 

No: Small grains STOP 

48. Do the proportional distributions of the small grain subclasses 
correspond to the historical percentage of confusion small grains? 

Yes: Wheat Go to 52

No: Go to 49

49. Is the proportional distribution of the small grains subclass 
consistent with the historical percentage of wheat? 

Yes: Go to 50 

No: Small grains STOP 





50. Is the small grains subclass a relatively pure wheat class?

Yes: Go to 51

No: Small grains STOP

51. Does the pixel belong to the above mentioned subclass?

Yes: Wheat Go to 52

No: Small grains STOP

52. Analyst estimate of pixel's growth stage is 



Appendix A-2

Revised List Experiment Questions 


1. Segment # 

2. Partition # 

3. Segment Type

(winter wheat, spring wheat, mixed wheat) 

4. Country 

5. State 

6. County


Segment Questions from Imagery 

7. Is there any agricultural land present in this segment? 

(Check full frame, multitemporal imagery, maps and previous year's 
imagery.) 

Yes: Go to 8 

No: Stop 

8. List the interpretable acquisition dates in the space provided (YDDD) . 

9. Acquisition date chosen by analyst as registration date is _________ 

Indicate (a) YDDD and (b) biowindow. (This is not necessarily the
Goddard reference segment.)

10. Is the segment representative of the general area? (Check full frame 
and ancillary data) . 

Yes 

Indeterminate 

No 

11. Are there strip fields in the cultivated area? 

Yes 

No 


Cropping Practices 

12. Are wheat and/or other small grains continuously cropped in this area?
Yes 

Indeterminate 

No 
13. Is fallowing practiced in this area?

Yes

Indeterminate

No

14. Are the small grains irrigated in this area? 

Yes 

Indeterminate
No 

15. Determination of potential confusion:

a. List the most recent percent of county area occupied by each of 
the applicable major crops. 

b. Using the nominal crop calendar, determine the possibility of 
confusion between wheat (winter and/or spring) and the other 
major crops for each acquisition. 

+1 = No confusion 
0 = Indeterminate 
-1 = Confusion


Met Data 


16. Robertson biostage for the segment for each acquisition is 

17. Total precipitation (in inches) for the week prior to each acquisi- 
tion as provided in the weekly meteorological summary is 

18. Expected normal yield for this segment is _ 

19. Evaluation of crop condition for each acquisition is (check met
summary).

2 = Significantly above normal 
1 = Above normal 
0 = Near Normal 
-1 = Below normal 
-2 = Significantly below normal 


Pixel Specific Questions 

20. Is pixel a DO'd pixel? 
Yes : DO STOP 

No: Go to 21 

21. Is pixel a DU'd pixel?

Yes: DU STOP

No: Go to 22

22. Is pixel registered with regard to analyst chosen registration date?

Yes: Go to 23 

Indeterminate: STOP 

No: STOP 

23. Is pixel a mixed pixel (part of more than one field or boundary)? 

Yes (on all 4) : Edge STOP 

Indeterminate: Go to 24 

No : Go to 24 

24. Green number of pixel is . 

25. Is this an anomalous pixel (not representative of most of the other 
pixels within the field) ? 

Yes 

No 

Go to 26 

26. PFC vegetation canopy indication is ________. (Product 1)

-1 = No vegetation canopy 

0 = Indeterminate 

1 = Low density green vegetation canopy 

2 = Medium density green vegetation canopy 

3 = High density vegetation canopy 

4 = Senescing (turning) vegetation canopy 

5 = Harvested canopy (stubble) 

27. PFC vegetation indication is ________. (Kraus Product)

Same code as 26. If all available data indicates no vegetation,
go to 28. Otherwise, go to 30.

28. Is pixel a non-ag pixel? (Check all available data.) 

Yes: Non-Ag STOP 

No: Go to 29 

29. Is Question 13 affirmative? 

Yes: Fallow STOP

No: Non-small grain STOP 

30. Is the vegetation indication of the pixel on PFC Product 1 valid for
the Robertson biostage of wheat for the acquisition? (Check keys 
for partition.) 

Yes 

Indeterminate 
No 

31. Is the vegetation indication of the pixel on PFC Kraus product valid
for the Robertson biostage of wheat for the acquisition?

Yes

Indeterminate

No

32. Is the green number of the pixel within the range for small grains?
(Check green number/biostage chart.)

Yes 

Indeterminate 

No 

33. Does the trajectory plot of this pixel match a small grains trajec- 
tory plot? (Answer for fourth acquisition only.) 

Yes 

Indeterminate 

No 

34. Does all available data indicate pixel follows a small grains 
development pattern? 

Yes: Go to 37 

No: Go to 35 

35. Is Question 25 affirmative? 

Yes: Go to 36 

No: Go to 38 

36. Does field around pixel follow a small grain development pattern? 
(Check available data.) 

Yes: Go to 37 

No: Go to 38 

37. Does all available data indicate wheat is the only small grain in
this area? 

Yes: Wheat Go to 47 

No: Go to 40 

38. Does crop statistical data indicate significant occurrence of 
other crop types? 

Yes: Non-small grain STOP

No: Go to 39 

39. Do ancillary and met data indicate that the departure of the observed
spectral signature from an expected normal small grains spectral 
signature could be due to an abnormal small grains signature 
development? 

Yes: Go to 34 and re-evaluate 

No: Non-wheat STOP 

40. Does the nominal crop calendar indicate an out-of-phase relationship 
between wheat and other confusion small grains? 

Yes: Go to 41 

Indeterminate: Small grains STOP 

41. Can subclasses of small grains be identified on PFC products or 
spectral plots as early, medium, or late developing?

Yes: Go to 42

No: Small grains STOP
42. Does the stage of development of any of these subclasses correspond
to the indicated stage of development for the out-of-phase confusion 
small grains? 

Yes: Go to 43 

No: Small grains STOP 

43. Do the proportional distributions of the small grain subclasses
correspond to the historical percentage of confusion small grains?

Yes: Wheat Go to 47

No: Go to 44 

44. Is the proportional distribution of the small grains subclass con- 
sistent with the historical percentage of wheat? 

Yes: Go to 45 

No: Small grains STOP 

45. Is the small grains subclass a relatively pure wheat class? 

Yes: Go to 46

No: Small grains STOP 

46. Does the pixel belong to the above mentioned subclass? 

Yes: Wheat Go to 47 

No: Small grains STOP

47. Analyst estimate of pixel growth stage is ________.
B. Application and Evaluation of Landsat Training, Classification,
and Area Estimation Procedures for Crop Inventory* 

The need for accurate and timely crop production information on a 
global basis is increasing each year as the world's growing population 
increases the demand for food. In mid-1972, the world food situation 
changed as production declined for the first time in many years at a 
time of rapidly increasing demand. The importance of crop production 
information has been recently highlighted by severe drought in the 
Soviet Union causing large purchases of wheat and increased grain 
exports by the U.S. to all parts of the world. 

Considerable evidence has developed that multispectral remote 
sensing from satellites combined with computer-aided data analysis 
can provide the data necessary for upgrading our capability to monitor 
and inventory the world's croplands. The first milestone in the 
development of the technology was collection in 1964 of multispectral 
photography for the first time over agricultural fields and recognition 
of the potential of the multispectral approach for crop identification [4].
In 1967 a crop classification was made of multispectral scanner data using
pattern recognition methods implemented on a digital computer [5]. The
Corn Blight Watch Experiment, conducted in 1971 over seven Corn Belt
States, provided a prototype remote sensing system which successfully
integrated techniques of sampling, data acquisition, processing, analysis,
and information dissemination in a quasi-operational system environment [10].
Multivariate pattern recognition methods implemented on a digital computer
were used to classify Landsat-1 data acquired over a three-county area in
northern Illinois and the area estimates obtained for corn and soybeans
were within 1.5 and 1.1 percent, respectively, of those made by the
U.S. Department of Agriculture [2].


* This report, describing the work of Task 2B, Application and Evaluation
of Landsat Training, Classification, and Area Estimation Procedures
for Crop Inventory, was written by Marilyn Hixson.
Based on these and other results, the Large Area Crop Inventory
Experiment (LACIE) was initiated in 1974, using the remote sensing
technology available at that time, to estimate wheat production at the
country level [9]. In LACIE, training and classification were typically
performed on each 5 x 6 nautical mile (nm) segment of Landsat MSS data.
Large area estimates for wheat were made by aggregating the proportion of
wheat in the individual segments, which together represented about two
percent of the total land area. Since the estimates were based on a
relatively small number of segments, the sampling errors associated with
estimates were quite large (more than 4% at the country level).

Since the LACIE system was designed, new information has been
acquired on scene stratification, training sample selection,
classification algorithms, and area estimation methods. This research task
will build upon these recent developments to improve future crop inventory
systems.

In particular, three classification algorithms developed at LARS,
ECHO (Extraction and Classification of Homogeneous Objects), cascade
classifier, and layered classifier, will be tested [6,8]. The layered
and cascade classifiers are both multistage classifiers which eliminate
many of the difficulties encountered with the "stacked vector" or
"concatenation" approach to multitemporal analysis. ECHO can also be
used in an unsupervised mode as a training aid. Past studies have
investigated different sampling schemes for training and different area
estimation procedures [1,7].

This investigation is divided into two phases: a preliminary
study and a major study. The objectives of the major study include
investigations of training area selection and training, classification,
and area estimation procedures. Specific objectives are:

1. Training 

To evaluate and extend procedures for the training area selection 
including factors such as size, number, and geographic 
location of the training areas. 
To refine procedures for obtaining class statistics from
multiple training areas. Training methods include ISOCLS,
multi-block clustering, and ECHO.

2. Classification 


To assess the accuracy of the area estimates of corn and
soybeans obtained by different classification algorithms:
per point maximum likelihood, ECHO, and sum of densities.

To assess the accuracy of multitemporal classification
(including LACIE Procedure 1) as compared to the unitemporal
classifications.

3. Area Estimation 

To compare the accuracy and precision of area estimates for
corn and soybeans obtained by different estimation methods;
specifically, to compare estimates obtained by classification
and aggregation of a systematic sample of pixels with estimates
made from a sample segment approach.

To compare methods of obtaining unbiased estimates such as 
stratified area estimates and the regression approach. 

At the request of NASA/JSC, the implementation plan for this task
was revised in mid-May. This revision was to reflect the increased
emphasis on Multicrop and permit the establishment of a supporting
field research task. That effort, which was conducted as part of this
task, is described in Volume I, Section C, of this report.

At the time this investigation was begun, data appropriate for
addressing the original objectives was not available. Therefore, during
the period of data acquisition, a preliminary study was conducted using
currently available data. The activities of this task during the past
contract year have been in three areas:
(1) Development of the experiment design and definition of
data requirements for the major study. As an extension of this
objective, a stratification and sampling plan for the
NASA/JSC 1978 corn/soybeans data acquisition program was defined
and carried out by LARS.

(2) Recommendations for reference data acquisition. Data to be 
acquired as inventory and periodic observations were recommended. 
Flightlines and dates for aerial photography acquisition were 
recommended. 

(3) Evaluation of the training-classification procedures used in 
LACIE (Procedure 1) for a corn/soybeans/"other" crop 
identification problem and investigation of changes to improve 
the performance of Procedure 1 on corn and soybeans. This 
study has been conducted using currently available data; 
results will need to be confirmed when additional sample segments 
become available. 

These three general areas of effort are addressed in this report. 
Section 1 describes the stratification and sample selection work conducted 
for the transition year experiments. The data acquisition is discussed 
in Section 2 and Section 3 describes the preliminary study. 



1. STRATIFICATION AND SAMPLE SELECTION FOR MULTICROP EXPERIMENTS


1.1. Introduction 

In February 1978, LARS was asked to participate in the stratification
and sampling tasks for the transition year experiments. The
project was supported by personnel and funds from two tasks of
NASA Contract NAS9-15466: "Application of Statistical Pattern Recognition
to Image Interpretation" and "Application and Evaluation of
Landsat Training, Classification, and Area Estimation Procedures for
Crop Inventory."

The purpose of this effort was to identify the locations of the 
sample segments for the 1978-79 Multicrop experiments to support: 

- Development and evaluation of procedures for using LACIE and
other technologies for the classification of corn and soybeans.

- Identification of factors likely to affect classification 
performance. 

- Evaluation of problems encountered and techniques which are 
applicable to the crop estimation problem in foreign countries 
as well. 

In order to meet these requirements, two types of samples were 
selected. Low density segments were distributed throughout corn and 
soybean producing areas to sample all variations of conditions which 
could affect classification accuracy and to more completely represent
conditions which might be found in other countries. High density
segments were selected in smaller areas to support the investigation 
of training, classification, and area estimation procedures on a 
smaller scale for possible use in future Multicrop experiments. 

In this report, the data set and methods employed in the stratification
are discussed. Rationale, methods, and results for both the low
and high density segments are discussed.
1.2 Objectives

In order to support the corn and soybean experiments, two types
of segments were selected: low density segments and high density
segments. Different issues can be addressed using each type of segment.

The low density segments were selected to cover a wide range of
conditions under which areas will have to be classified in larger
Multicrop efforts to allow possible problems to be examined (e.g.,
in algorithms, systems, data acquisition). The low density samples
were located in 14 states in the U.S. corn and soybean producing areas.
This region was divided into eight strata according to the level of
county production of corn and soybeans and average farm size. Twenty
segments per stratum were selected. The distribution of these segments
permits the calculation of variability within a stratum to predict
the variability of aggregated estimates of corn and soybeans in
the U.S. and to determine the optimum allocation of samples for making
such estimates. The allocation of these samples was not designed
for, and thus does not support, making aggregated estimates.
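
The text does not say which allocation rule would be applied once the within-stratum variability is known. One standard choice is Neyman allocation, in which the number of samples in a stratum is proportional to the stratum size times its standard deviation. The sketch below illustrates that rule only; the function name and all numbers in the usage lines are hypothetical and are not taken from the study.

```python
def neyman_allocation(stratum_sizes, stratum_std_devs, total_samples):
    """Allocate total_samples among strata in proportion to N_h * S_h.

    stratum_sizes:    number of sampling units N_h in each stratum
    stratum_std_devs: within-stratum standard deviation S_h of the
                      quantity being estimated
    Returns fractional allocations; rounding to whole segments is left
    to the caller.
    """
    weights = [n * s for n, s in zip(stratum_sizes, stratum_std_devs)]
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("at least one stratum must have nonzero N_h * S_h")
    return [total_samples * w / total_weight for w in weights]


# Hypothetical two-stratum example: equal sizes, unequal variability.
example_allocation = neyman_allocation([100, 100], [1.0, 3.0], 40)
```

In this example the more variable stratum receives three times as many samples as the less variable one, in contrast to the equal allocation of twenty segments per stratum used here to measure the variability in the first place.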

The high density samples are located in four test sites in high
production areas of the U.S. Corn Belt. Twenty segments were selected
from each test site, which is approximately ten counties in size. The
increased density of samples permits estimation of the local variability
in high-production areas. These samples support the investigation
of training, classification, and area estimation procedures on a
smaller scale for possible use in future Multicrop experiments. Other
area estimation procedures such as regression estimation can be evaluated
and county level estimates can be assessed.

1.3 Data Set Description 

The data used in this study were acquired by the Statistical
Reporting Service of the U.S. Department of Agriculture (USDA/SRS).
Two types of data were available: the USDA/SRS county estimates for
1972-76 and the 1974 agriculture census data. The data were supplied
by NASA/Johnson Space Center (NASA/JSC).
The SRS dual county estimates program data for 1972-76 were available.
Under the Federal program, county estimates are prepared for
specified crops, states, and counties. These estimates include the
major crops produced in most states. Some of the state statistical
offices prepare county estimates for a few crops not required under
the Federal program in cooperation with their respective state
governments, but these estimates were not available on tape.

Variables which were included in the county estimates data set
were: state, crop reporting district, county, year data was punched,
crop year, commodity code, acres planted, acres harvested, yield per
harvested acre, and production (Figure B-1). Counties from the entire
United States were represented. The commodities for which information
was available are listed in Table B-1.

The 1974 agriculture census data vere supplied for 14 states in 
the U.S. corn and soybean producing regions. These data included: 
number of acres in each county, average farm size by county, and 
the. land in farms for each county. 

1.4 Stratification 


The first step in selection of sample segments was the stratifi- 
cation of the area to be studied. The variables used in the strati- 
fication, the rationale and methods employed, and the results of the 
stratification will be discussed in this section. 

Variables Used in Stratification . 

Table B-1. Crops included in the USDA/SRS county estimates program.

Winter Wheat
Durum Wheat
Other Spring Wheat
Wheat, All
Rye, All
Rice, All
Corn for Grain
Corn for Silage
Oats, All
Barley, All
Sorghum, All
Cotton, All
Cotton, Upland
Cotton, American Pima
Tobacco
Flaxseed
Peanuts
Soybeans
Dry Edible Beans - Pea (Navy)
                 - Great Northern
                 - Flat Small White
                 - Pinto
                 - Red Kidney
                 - Pink
                 - Small Red
Dry Beans (All Mich.)
Dry Peas - Smooth Green Kinds, All
         - Yellow and White Kinds, All
Wrinkled Peas for Seed
Lentils, All
Austrian Winter Peas
Green Peas for Processing, All
Tomatoes for Processing, All
Bush Garden Seed Beans (Idaho)

The variables available were those contained in the USDA/SRS
county estimates program (Figure B-1) and the selected variables from
the 1974 agriculture census which were supplied by NASA/JSC. The
variables which were considered for use were: acres planted, acres
harvested, yield, and production for the crops listed in Table B-1;
acres in a county; percent agricultural area (land in farms) in a
county; and average farm size by county. From these variables, the
number of agricultural acres in a county was computed by multiplying
the percent agricultural area by the county acreage. Normalized
production of a crop for a county was computed by dividing the
five-year average production of that crop by the agricultural acres in
the county.
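
The two derived quantities defined above can be illustrated with a
short Python sketch. The function names and the county figures are
hypothetical and are used only to show the arithmetic; the report does
not state the units of production, so bushels are assumed here.

```python
def agricultural_acres(percent_ag_area, county_acres):
    """Agricultural acres = percent agricultural area x county acreage."""
    return percent_ag_area / 100.0 * county_acres


def normalized_production(yearly_production, ag_acres):
    """Five-year average production divided by agricultural acres."""
    mean_production = sum(yearly_production) / len(yearly_production)
    return mean_production / ag_acres


# Hypothetical county: 400,000 acres, 80 percent of it land in farms.
ag_acres = agricultural_acres(80.0, 400_000)

# Hypothetical corn production (assumed bushels) for crop years 1972-76.
corn_production = [9.0e6, 10.0e6, 8.0e6, 11.0e6, 12.0e6]
corn_normalized = normalized_production(corn_production, ag_acres)
```

Dividing by agricultural acres rather than total county acres keeps
counties with large nonagricultural areas from appearing less
intensively cropped than they are.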

In order to fulfill the objectives, the stratification was
performed using three variables: normalized production of corn,
normalized production of soybeans, and average farm size. The first two
variables were selected to make strata which are homogeneous with
respect to the relative importance of corn and soybeans in the
agricultural scene. The average farm size was selected to represent
problems which might be encountered in Landsat data classifications
with different field sizes.

Methods of Stratification . 

The rationale for the stratification method was based upon the 
objective of creating eight strata in the United States corn and
soybean producing regions which were relatively homogeneous with 
respect to the relative importance of corn and soybeans in the agri- 
cultural scene and the average farm (or field) size. These strata, 
then, represent several conditions under which Landsat data will have 
to be classified in Multicrop studies. Samples selected from these strata 
will be representative of conditions found throughout the corn and 
soybean producing regions. 

The first step in the stratification was a reduction of the data 
set size. Only the 14 states for which the agriculture census data 
were supplied were considered. Counties with neither corn nor soybeans 
were omitted. 

The joint distributions of normalized corn and soybean productions
and average farm size were examined. The average farm size was
represented in two groups: small farms (average size less than or
equal to 190 acres) and large farms (size greater than 190 acres).
About one-third of the counties were in the small farms category
and about two-thirds were in the large farms category. The division
into these two groups was somewhat arbitrary although there was a
break in the continuum of data at about 190 acres.

For each farm size, the normalized corn and soybean productions 
were displayed in deciles to look for broad clusters of data. The 
strata were determined by examining tables of the distributions of 
these variables. Three strata of small farm counties and five strata 
of large farm counties were selected to represent the two farm sizes 
approximately proportionally to the number of counties in them. 

Counties which fell in the lower 10% of all counties in both
corn and soybean production were omitted from consideration.
Counties which fell outside the broad clusters of data were not included
in any stratum. Thirteen counties satisfying all other selection
criteria were outliers from the clusters and were not included. A
schematic diagram (Figure B-2) shows the methodology employed in the
stratification. Table B-2 gives the definitions of stratum boundaries.
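
The assignment rule implied by Table B-2 can be sketched in Python as
follows. The stratum boundaries are copied from Table B-2; the function
names are hypothetical, and the handling of values that fall exactly on
a boundary (lower bound included, upper bound excluded except at 100)
is an assumption, since the report does not say how ties were resolved.

```python
# (stratum, farm-size group, corn range, soybean range) from Table B-2.
# Ranges are positions in the distribution of normalized production.
STRATA = [
    (1, "small", (0, 40), (0, 40)),
    (2, "small", (40, 60), (30, 70)),
    (3, "small", (60, 100), (50, 100)),
    (4, "large", (0, 40), (0, 30)),
    (5, "large", (0, 40), (30, 70)),
    (6, "large", (40, 60), (30, 70)),
    (7, "large", (60, 80), (50, 90)),
    (8, "large", (80, 100), (70, 100)),
]


def in_range(value, bounds):
    """Assumed convention: low <= value < high, with 100 included."""
    low, high = bounds
    return low <= value < high or (high == 100 and value == 100)


def assign_stratum(avg_farm_size, corn_pct, soy_pct):
    """Return the stratum number, or None if the county is excluded."""
    # Counties in the lower 10% for both crops were omitted.
    if corn_pct < 10 and soy_pct < 10:
        return None
    group = "small" if avg_farm_size <= 190 else "large"
    for stratum, size_group, corn_bounds, soy_bounds in STRATA:
        if (size_group == group
                and in_range(corn_pct, corn_bounds)
                and in_range(soy_pct, soy_bounds)):
            return stratum
    # Outside every cluster: treated as an outlier, not in any stratum.
    return None
```

A county that matches no row, such as a large-farm county with very
high corn but very low soybean production, is returned as None, which
corresponds to the outlier counties left out of all strata.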


Results of Stratification . 

Eight strata covering 14 states in the U.S. corn and soybean
producing region were determined. The counties in each of these
strata are shown in Figures B-3 to B-10. Lists of the counties can be
obtained in a complete report of this work [3].

The large farm, highest production stratum (stratum 8) is
geographically located at the center of the Corn Belt. Strata 7, 6, and
4 are located around its perimeter outward according to decreased
production. In these strata of large farms, corn and soybeans are of
approximately equal importance.

Stratum 5 is located geographically apart from the other strata
with large farms. This stratum, in which soybeans have a greater
importance than corn, is located in the Mississippi River Valley
where the climate and soils are more suited to soybeans than to corn.

[Figure B-2: two panels, each plotting normalized corn production
against normalized soybean production, with shading indicating
increasing density of counties.]

Figure B-2. Schematic diagram illustrating the determination of
strata for Multicrop experiments based on normalized
production of corn and soybeans and average farm size.

Table B-2. Determination of strata according to the normalized production
of corn and soybeans and average farm size.

            Average       Normalized Production
Stratum     Farm Size     Corn          Soybeans      No. of
Number      (acres)       (deciles)     (deciles)     Counties

   1        ≤190          0-40          0-40          149
   2        ≤190          40-60         30-70         109
   3        ≤190          60-100        50-100        126
   4        >190          0-40          0-30          192
   5        >190          0-40          30-70         102
   6        >190          40-60         30-70         126
   7        >190          60-80         50-90         147
   8        >190          80-100        70-100        213

Figure B-3. Locations of counties assigned to Stratum 1, small
farms, low production of corn and soybeans.

Figure B-4. Locations of counties assigned to Stratum 2, small
farms, medium production of corn and soybeans.

Figure B-5. Locations of counties assigned to Stratum 3, small
farms, high production of corn and soybeans.

Figure B-6. Locations of counties assigned to Stratum 4, large
farms, low production of corn and soybeans.

Figure B-7. Locations of counties assigned to Stratum 5, large
farms, low production of corn, medium production of
soybeans.

Figure B-8. Locations of counties assigned to Stratum 6, large
farms, medium production of corn and soybeans.

Figure B-9. Locations of counties assigned to Stratum 7, large
farms, high production of corn and soybeans.

Figure B-10. Locations of counties assigned to Stratum 8, large
farms, highest production of corn and soybeans.

Stratum 3, the small farm stratum with the greatest production
of corn and soybeans, is located primarily in eastern Indiana and
western Ohio where the cropland is productive, but the terrain is 
rolling. The lesser production small farm strata (strata 1 and 2) 
are centered about this area on the outskirts of stratum 3. 

In summary, looking at the geographic location of the strata, 
the system appears to be logical and the various strata seem to 
represent different conditions. This result is supportive not only 
of the variables and the methodology employed in the stratification, 
but also of the validity of the data sets employed. 

1.5 Low Density Segments 

Sample Allocation . 

The low density segments were selected to sample the variability
present in corn and soybean producing regions of the United States.
The sample was designed to represent differences in climate, topography,
field size, variety, and management practices. In order to achieve as
diverse a representation as possible, an equal number of segments were
allocated to each of the strata. This allocation scheme emphasizes
representation of variability rather than sampling in a manner suitable
for aggregation purposes.

Twenty 5x6 nautical mile segments were allocated to each stratum. 
The counties to receive sample segments were determined using a random 
selection procedure without replacement. Thus, all counties in a 
stratum had an equal probability of receiving a sample and no county 
could contain more than one segment. Locations of counties receiving 
sample segments are illustrated in Figure B-11. Latitude and longitude 
coordinates of the sample segments can be found in the LARS technical 
report on this work [3].

Figure B-11. Locations of counties in all eight strata receiving low
density sample segments.
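
The county selection for the low density sample (twenty counties per
stratum, drawn at random without replacement) can be sketched in
Python. The function name, the county identifiers, and the use of a
seed are hypothetical; the report does not describe the random number
procedure actually used.

```python
import random


def allocate_segments(strata_counties, per_stratum=20, seed=None):
    """Draw per_stratum distinct counties from each stratum.

    Sampling without replacement gives every county in a stratum the
    same chance of selection and at most one segment per county.
    """
    rng = random.Random(seed)
    return {
        stratum: rng.sample(counties, per_stratum)
        for stratum, counties in strata_counties.items()
    }


# Hypothetical county lists sized like strata 1 and 2 in Table B-2.
strata_counties = {
    1: [f"county_1_{i}" for i in range(149)],
    2: [f"county_2_{i}" for i in range(109)],
}
selection = allocate_segments(strata_counties, per_stratum=20, seed=1978)
```

Because the same number of counties is drawn from strata of different
sizes, the selection probabilities differ between strata, which is why
the sample is not suited to aggregation without reweighting.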







Segment Location . 


Segment locations were selected using a modification of a computer 
program written for "Crop Inventory Using Full-Frame Classification", 
described in the final report of Contract NAS9-14970 (June, 1977). 

The design of the location procedure was based upon that used in LACIE. 

A grid was laid over each county with grid intersections five by six 
nautical miles apart. A random selection procedure was then used to 
select a grid intersection which determined the latitude and longitude 
coordinates of the center point of each segment. 

Although only one segment was allocated to each county, several 
segments were selected to attain a high probability that at least one 
of them would be located in an agricultural area and would be accepted 
as a site. The number of sites to be located in each county was 
determined by the percent agricultural land in the county. The segment 
centers were randomly selected without replacement and the first segment 
located outside a nonagricultural area was to be used.

The ag/nonag delineation was conducted by NASA/JSC. Full-frame
color composite Landsat imagery was used to delineate areas which
were not agricultural. This was done on the basis of whether or not
field patterns were apparent. Rangeland, forest, and urban areas
were among the types of land uses which were delineated as nonag.
Segment locations were compared with these boundaries and the segment
was rejected if less than 5% of the segment fell into an agricultural
area.
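
The segment location procedure described above can be sketched in
Python. This is an illustration under stated assumptions, not the
program referenced in the text: the county is represented by a
rectangular bounding box measured in nautical miles, the 6 nautical
mile spacing is assumed to run east-west and the 5 nautical mile
spacing north-south, and ag_fraction is a hypothetical caller-supplied
function standing in for the comparison with the NASA/JSC ag/nonag
boundaries.

```python
import random


def grid_intersections(width_nm, height_nm, dx_nm=6.0, dy_nm=5.0):
    """Grid intersections over a county bounding box (nautical miles)."""
    xs = [i * dx_nm for i in range(int(width_nm // dx_nm) + 1)]
    ys = [j * dy_nm for j in range(int(height_nm // dy_nm) + 1)]
    return [(x, y) for x in xs for y in ys]


def locate_segment(intersections, n_candidates, ag_fraction, rng,
                   min_ag=0.05):
    """Return the first randomly drawn center that is acceptable.

    Candidates are drawn without replacement; a candidate is rejected
    if less than min_ag of the segment is agricultural.
    """
    n = min(n_candidates, len(intersections))
    for center in rng.sample(intersections, n):
        if ag_fraction(center) >= min_ag:
            return center
    return None  # every candidate fell in a nonagricultural area


# Hypothetical county 24 x 20 nautical miles whose western part is nonag.
points = grid_intersections(24, 20)
center = locate_segment(
    points,
    n_candidates=len(points),
    ag_fraction=lambda c: 0.0 if c[0] < 12 else 0.9,
    rng=random.Random(0),
)
```

In the procedure described in the text, the number of candidates drawn
for a county depended on its percent agricultural land; here it is
simply passed in as n_candidates.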

1.6 High Density Segments 

Test Site Selection . 

The high density segments were designed for intensive study of the 
remote sensing technology required for corn and soybean inventories. In 
order to sample more corn and soybeans, test sites were located in the
Corn Belt where production of both crops is high. Test sites were
placed across the Corn Belt to sample the varied climatic conditions,
soil types, crop distributions, and field sizes which are present
(Figure B-12). Each test site was selected to be relatively homogeneous 
within (same stratum, similar soil types and farming practices) to 
support classification studies, particularly of multisegment training. 
Each of the sites contained about ten counties and was approximately 
the size of a crop reporting district. 

Test Site 1 is located in eastern Indiana which is an area of 
small farms. The other three test sites are located in large farm 
areas. Test Site 2 is comprised of counties in west central Indiana 
and east central Illinois. Test Site 3 is in north central Iowa and 
Test Site 4 is in west central Iowa. 

Description of Test Sites 1 and 2 . The climate across central 
Indiana and east central Illinois is continental with warm summers and 
cold winters. Normal mean temperature is -1.2°C in January and 31.1°C
in July. In this semihumid region of the U.S., the average annual
precipitation is 950 to 1000 mm which does not limit crop production. 
Rainfall is greatest during the spring and early summer months with 
June typically receiving 107 to 118 mm of rain. Average precipitation 
in June is slightly excessive, adequate in July, and often inadequate 
in August for corn. The crops survive because of some moisture stored 
in the soil profile. 

Test Site 1 is composed of two major soil associations. Soils
of the northern two-thirds of this district (Allen, Wells, Adams,
Blackford, Jay, and parts of Madison, Delaware, and Randolph counties)
belong to the Blount-Pewamo-Morley soil association. These soils were
formed on clayey glacial till and are nearly level and poorly to
very poorly drained. The Brookston-Crosby-Miami-Parr association which
predominates in the remainder of Test Site 1 was formed in thin loess
(wind-blown materials) over loamy glacial till and is also poorly drained.
These two soil associations are suited to intensive cropping but are
subject to problems associated with wet soils unless adequate artificial
drainage is provided. Typically, approximately 287,700 hectares of
corn for grain; 245,300 hectares of soybeans; and 87,300 hectares
of winter wheat are planted.

Test Site 2 includes dark-colored prairie soils and light- 
colored forest soils both of which were formed in loess-covered 
glacial till. Topography is generally gently rolling with short
slopes and nearly level areas interrupted by depressions or potholes.
The northern one-third of this district (Newton, Jasper, Kankakee, and
northern Ford and Iroquois counties) has soils which are sandy and 
variable in subsoil development. These soils tend to be droughty, 
low in fertility, and require a high level of management for moderate 
yields. In Tippecanoe, Benton, Warren, southern Ford and Iroquois, 
and northern Vermilion and Champaign counties in the central portion 
of the district, the soils developed under prairie or mixed prairie 
and forest vegetation, are dark to moderately dark colored, and are 
generally imperfectly drained. Crop yields are moderately high to high 
with a high level of management. Dark-colored soils on nearly level to 
moderately sloping upland areas are typical in southern Vermilion 
and Champaign counties. These soils have high available moisture 
storage capacities and are very highly productive under a high level of 
management. Farmers in Test Site 2 typically plant 667,700 hectares of 
corn; 557,200 hectares of soybeans; and 39,200 hectares of winter wheat. 

Description of Test Sites 3 and 4 . The climate in western Iowa 
is continental, characterized by marked seasonal changes. Temperature 
fluctuations are extreme with winters being cold and summers warm. 
Thirty-year normal temperatures are -8.4°C in January, the coldest month,
and 23.6°C in July, the warmest month. Annual precipitation is 762 mm
with most of it occurring in the spring and early summer. Summer 
precipitation is variable from year to year with the largest amount (132 mm) 
generally falling in June. 

The Clarion-Nicollet-Webster soil association, which is the only
major soil group in Test Site 3, was derived from glacial till. About
75 percent of the area has level to gently sloping topography and is
well suited to intensive production of corn, soybeans, and alfalfa.
This test site has about 1,499,600 hectares of farm land and typically
grows 607,300 hectares of corn (approximately 96% for grain); 477,100
hectares of soybeans; and 54,500 hectares of alfalfa and hay.

Three major kinds of parent materials (loess, glacial till, and
alluvium) are found in Test Site 4. Loess (wind-blown material) from
the Missouri flood plains is thickest near the Missouri River and
thins and increases in clay content in a southeasterly direction. The
Marshall and Monona-Ida-Hamburg soil associations, which occupy the
central three-fourths of this district, were formed from deep loess
under grass vegetation. These soils are generally well-drained and have
high proportions of their area used for cultivated crops. The
Clarion-Nicollet-Webster soil association, which is a continuation of
the predominant soil of the third test site, is the major soil in Sac
County. These soils are well suited to intensive production of corn,
soybeans, and alfalfa. A third major group of soils, which developed
primarily from alluvial materials on the nearly level flood plains of
the Missouri River, is the Luton-Onawa-Salix association. These soils
are found primarily along the Missouri River in Woodbury, Monona, and
Harrison counties and are farmed for corn, soybeans, and wheat.

High proportions of Test Site 4 are used for cultivated crops, 
particularly corn and soybeans. Of the 1,385,100 hectares of farm land 
in this district, 634,100 hectares of corn are planted annually and 
approximately 90 percent of this corn is harvested for grain. An 
additional 233,700 hectares of soybeans are typically planted. The 
proportions of corn and soybeans vary from year to year depending on
market conditions and prices.

Sample Allocation . 

In general, two segments per county were allocated. In the case of 
unusually large or small counties, three segments or one segment might 
be allocated. All counties indicated in Figure B-12 received segments. 
Table B-3 lists the number of segments allocated to each county.

Table B-3. Allocation of sample segments to counties in each of the
four high density test sites.

Test Site   State       County          No. of Segments

    1       Indiana     Adams                  2
                        Allen                  2
                        Blackford              2
                        Delaware               2
                        Henry                  2
                        Jay                    2
                        Madison                2
                        Randolph               2
                        Wayne                  2
                        Wells                  2

    2       Indiana     Benton                 2
                        Jasper                 2
                        Newton                 2
                        Tippecanoe             2
                        Warren                 2
            Illinois    Champaign              3
                        Ford                   1
                        Iroquois               3
                        Kankakee               2
                        Vermilion              3

    3       Iowa        Calhoun                2
                        Emmet                  2
                        Hamilton               2
                        Hancock                2
                        Humboldt               2
                        Kossuth                2
                        Palo Alto              2
                        Pocahontas             2
                        Webster                2
                        Wright                 2

    4       Iowa        Crawford               2
                        Harrison               2
                        Ida                    2
                        Monona                 2
                        Pottawatomie           3
                        Sac                    2
                        Shelby                 2
                        Woodbury               3

Sample Location .


The method used for sample selection was the same as described 
for the low density samples. More segments were located than were 
allocated to allow for the loss of some segments in nonagricultural areas.
Locations of the sample segments by latitude and longitude coordinates
can be found in the LARS technical report on this work [3].

1.7 Summary and Conclusions

A stratification was performed and sample segments were selected
for an initial investigation of Multicrop problems. The effort was to
support:

- Development and evaluation of procedures for using LACIE and
other technologies for the classification of corn and soybeans.

- Identification of factors likely to affect classification performance. 

- Evaluation of problems encountered and techniques which are 
applicable to the crop estimation problem in foreign countries 
as well. 

The two types of samples, low density and high density, supporting 
these requirements were selected as a research data set for an initial 
evaluation of technical issues and should not be used in an aggregation
scheme. In summary, looking at the geographic location of the strata, 
the system appears to be logical and the various strata seem to represent 
different conditions. This result is supportive not only of the variables 
and the methodology employed in the stratification, but also of the 
validity of the data sets employed. 



2. RECOMMENDATIONS FOR DATA ACQUISITION 


The new data set required to support the objectives of the major
study was to be acquired by NASA/JSC. In order to ensure that this
data set would meet our research objectives, recommendations were
made by LARS to NASA/JSC in the areas of crop inventory, periodic
observations, and acquisition of aerial photography.

2.1 Recommendations for the Collection of Crop Information

The material included in the following pages was sent to
NASA/JSC in early April, 1978. Recommendations are made for the
sampling schemes to be followed and the information to be acquired for
crop inventory and periodic observations. In addition to the materials
reproduced here, three appendices were included which displayed sample
data recording forms; discussed in detail how to identify the growth
stages of corn, soybeans, and wheat; and gave guidelines for crop
condition assessment.



COLLECTION OF CROP INFORMATION FOR MULTICROP EXPERIMENTS 


Introduction 


USDA and NASA are conducting experiments to evaluate new methods
of estimating crop production using Landsat (satellite) data. An
essential component of these experiments will be the collection of
reliable "ground truth" or ground observations of crops in selected
areas of the U.S. with which to develop procedures and evaluate results.

Some 160 test sites throughout the major corn and soybean
production regions of the U.S. have been selected for study. Two
kinds of ground truth data will be acquired for each segment: (1)
"wall-to-wall" inventory of the crop identification of all fields in
the 5x6 mile segment, and (2) periodic observations of the development
and condition of a selected subset of fields. Specific instructions
for each type of ground truth are given in the following paragraphs.

Ground Truth Sampling Methods 


A. Crop Inventory 

NASA will provide you with a recent aerial photograph of each
5 x 6-mile test site in your area. You should visit each field in the
test site which is larger than approximately 5 acres and identify
the crop or the current land use. Forms will be provided for recording
this information (Form 1). Field numbers and boundaries should be
marked on the aerial photo.

B. Periodic Observations of Crop Growth and Condition 

Within each test site designated for periodic observations, you
should choose a subset of fields, larger than 20 acres in size, for
evaluation of crop growth and condition. Ten fields of corn, 10 of
soybeans, and 10 of wheat, other small grains, and pasture or hay
crops should be selected.
If there are fewer than 10 fields of a particular crop within the test
site, use all fields of that crop which are available. These fields
should be sampled at 18 day intervals coincident with Landsat passes
from May 1 to October 30 and information recorded on the forms provided
(Form 2).
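
The 18 day observation schedule can be laid out with a few lines of
Python. This is only a sketch: the year 1978 and a first Landsat pass
falling exactly on May 1 are assumptions made for illustration, since
the actual pass dates depend on the orbit over each test site.

```python
from datetime import date, timedelta


def observation_dates(first_pass, last_day, interval_days=18):
    """List visit dates from first_pass through last_day, inclusive."""
    dates = []
    current = first_pass
    while current <= last_day:
        dates.append(current)
        current += timedelta(days=interval_days)
    return dates


# Assumed season: first pass on May 1, 1978, season ending October 30.
schedule = observation_dates(date(1978, 5, 1), date(1978, 10, 30))
```

Under these assumptions each field would be visited eleven times during
the season.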

1. Sampling Within A Field. 

Field sampling need not be highly complicated, but to be sure
your sampling is reasonably accurate and unbiased, please observe the
following guidelines:

- Do not sample field borders, fencerows, ditchbanks or other
similar field areas. Sampling these areas may provide
misleading information concerning the field as a whole.
Therefore, go into the field at least 75 feet or 30 rows before
beginning any type of sampling procedure.

- When sampling, try to make sure the sample represents the
entire field. Field conditions may not be uniform; therefore,
sampling from only a small area of the field may lead to erroneous
conclusions concerning the whole field. Look at the following
illustrations to see how to spread the sampling over the entire
field. Each "x" indicates where a sample should be taken for
different field shapes. Use your own imagination for field
shapes not shown.

- Take the samples randomly. When you get ready to start
sampling, do not look at a plant and say to yourself, "Hey,
this looks like a good plant to start with!" Instead, when
you get into the general area of the field where you need to
take a sample, look up at the sky or nearby tree, etc.; look
at anything but the crop. Walk forward five paces and start
the sampling procedure with the plant nearest the toe of
your right foot. Repeat this in each area of the field to
be sampled.

2. Evaluating Crop Growth and Condition.

The following are instructions for completing Form 2 and evaluating
crop growth and conditions.

1) County and State - name of county and state of the test site.

2) Segment Number - the number of the test site.

3) Date - date of this observation.

4) Crop ID - name of crop or cover type.

5) Field No. - number assigned to the field.

6) Plant Height - measure 5 representative plants in each of 5 locations
in the field and record the average plant height for each location.
Measure without extending or pulling leaves up.

7) Percent Ground Cover - estimate, to the nearest 10%, the
percent of ground covered by the crop canopy.

8) Growth Stage - use the growth stage indices for corn, soybeans
and wheat and evaluate the field as a whole.

9) Green Leaves - estimate, to the nearest 10%, the percentage of the
leaves on the plants which are green.

10) Crop Condition - Evaluate the quality of the crop in each field in
terms of each of these factors which may reduce crop yields.
A rating of "0" indicates no effect of a particular factor and
is the most desirable condition while a rating of "4" indicates
that severe crop losses are expected. Because these ratings
are somewhat subjective, the guidelines in Appendix 2 are
recommended for this study. Additional comments about crop
condition should be recorded in the "comments" section.

11) Comments - Describe other factors which will affect the production
of each field (e.g., flooded areas, herbicide damage).
Describe and give approximate dates of any major field operations
(e.g., planting, cultivation, harvesting) which have occurred
since your last visit to each field.

The following data should be obtained only once, and only for the corn,
soybean, and wheat fields being observed periodically. These data should
be recorded on Form 3.

1) Hybrid or Variety - For grain crops record the variety planted
of corn (e.g., DeKalb XL 45), soybeans (Amsoy 71) or wheat
(Arthur). For pastures and forages, record the species
(e.g., bromegrass, alfalfa, or orchardgrass-alfalfa mixture).

2) Date Planted - applies to annual crops only.

3) N Applied - record the pounds per acre of actual nitrogen
applied to this field. Two hundred pounds of 33-0-0 fertilizer
equals 66 pounds of actual nitrogen.

4) Row Width - the distance between the centers of plants in adjacent
rows. Ignore for broadcast crops and forages.

5) Plant Population - applies to corn and soybeans only. Count the
number of corn stalks in 50 feet of row or the number of soybean
stems in 5 feet of row in 5 different areas of each field. These
counts may be made anytime after all plants have emerged.

6) Comments - Additional descriptive information describing the
field.
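
The actual nitrogen figure requested under "N Applied" is the product
of the fertilizer weight and the first number of the fertilizer grade,
which gives the percent nitrogen by weight. A small Python sketch (the
function name is hypothetical) reproduces the example given in the
instructions:

```python
def actual_nitrogen(fertilizer_lb_per_acre, grade):
    """Pounds of actual N per acre from a grade such as '33-0-0'."""
    n_percent = float(grade.split("-")[0])  # first number is percent N
    return fertilizer_lb_per_acre * n_percent / 100.0


# The example from the instructions: 200 lb of 33-0-0 per acre.
n_applied = actual_nitrogen(200, "33-0-0")  # 66.0 lb of actual N
```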


Note: We would also like to obtain an estimate of the grain yield of
these fields. Separate instructions and forms for yield will
be provided later.

2.2 Recommendations for the Collection of Aerial Photography

The acquisition of aerial photography for the Multicrop high 
density segments is an important aspect of the Multicrop program. 

The aerial photography will permit objectives to be addressed concerning 
the location, number, and size of areas for training. These are 
questions which need to be answered for the optimal design of a crop 
inventory system. 

The areas which are covered by aerial photography will be 
photointerpreted and the accuracy of the photointerpretation process 
will be checked with the wall-to-wall ground truth on those high density 
segments which are also covered by the flightlines. If aerial photography 
flightlines, ground truth over high density segments, and multitemporally 
registered Landsat full frames are available, a study of training 
procedures can draw upon these photointerpreted areas to look at dispersion 
of training areas throughout the area to be classified, the optimal 
total amount of training, and how this amount should be divided into 
size and number of segments. 

To achieve these objectives, the aerial photography acquisition 
should follow these specifications: 

Location . Four high density test areas have been located in 
eastern Indiana, west central Indiana/east central Illinois, north central 
Iowa, and west central Iowa. Three or four flightlines should be 
flown for each of the test sites totaling an average of about 400 
miles per test area. The flightlines should be located such that the 
aircraft will cover exactly the same area each time. 

Type of photography . Nine-inch color infrared photography with 
a 20 percent forward overlap should be acquired. It should be flown 
at an altitude and scale such that a strip of land four to five miles 
wide is covered.

Times of acquisition . The photography should optimally be acquired
at three times during the growth season: May-June, July, and August.
The early mission will provide coverage when corn and soybeans have a low
percent soil cover to separate them from other cover types. The
crops will also be sampled twice during their growth to permit separation
of corn from soybeans. If only two missions can be acquired, these
should be in the August and June-July time frame. If only a single mission
can be flown, it should be in August.



3. EVALUATION OF PROCEDURE 1 FOR CORN AND SOYBEANS


3.1 Introduction


An analysis procedure known as Procedure 1 (P-1) was developed
for use during the Large Area Crop Inventory Experiment (LACIE).
The procedure encompasses the areas of training, classification, and
area estimation and emphasizes the use of multitemporal information.

In order to allow for extension of the LACIE procedures into foreign
countries, ground reference data were not used for training, but
analysts labeled training data by image interpretation. Procedure 1
utilizes a random grid selection technique to locate pixels (dots)
on segment imagery. Analysts label dots which fall on grid intersections.
These dots are of two types: Type 1 dots, which are used for starting
the clustering algorithm and labeling clusters, and Type 2 dots, which
are used as bias correction dots. This approach reduced analyst time
significantly, allowing the analyst to concentrate on just the labeling
operation. In addition, this method of selecting training areas should
be unbiased, which is an advantage over analyst-selected training data.

Another aspect of Procedure 1 which is designed to reduce bias
is the use of designated other (DO) and designated unidentifiable
(DU) areas. If an area is clearly not of interest (e.g., woods for
a wheat inventory), that area is labeled DO to prohibit any of that
area from being classified as the crop of interest. If an area is
covered by clouds, it is labeled DU, and the proportion estimates made
do not include this area.

A clustering algorithm is used to statistically define the training 
classes. Type 1 dots are used as starting vectors for the clustering 
algorithm and are also used to label the resulting clusters. The 
clustering algorithm used is the Iterative Self-Organizing
Clustering System Processor (ISOCLS). A sum of densities classifier
is then applied. The classification results are treated as a stratification
of the segment into the various classes of interest. The stratified
area estimate is then computed using the Type 2 dots to make proportion 
estimates in an unbiased way. 
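
The stratified estimate described above can be sketched as follows. This is a minimal illustration, not the P-1 software: the function name, the data layout, and the example counts are all hypothetical, and the handling of a stratum that receives no Type 2 dots is an assumption. Each classified stratum is weighted by its share of the segment, and the crop fraction within the stratum is taken from the analyst labels of the Type 2 dots that fall in it.

```python
from collections import Counter


def stratified_proportion(stratum_pixels, type2_dots, crop):
    """Stratified estimate of the segment proportion of `crop`.

    stratum_pixels: {stratum name: pixels the classifier put in that stratum}
    type2_dots:     (stratum, label) pairs, one per labeled Type 2 dot
    """
    total_pixels = sum(stratum_pixels.values())
    dots_per_stratum = Counter(stratum for stratum, _ in type2_dots)
    crop_dots = Counter(s for s, label in type2_dots if label == crop)

    estimate = 0.0
    for stratum, n_pixels in stratum_pixels.items():
        n_dots = dots_per_stratum[stratum]
        if n_dots == 0:
            continue  # assumption: a stratum without Type 2 dots adds nothing
        estimate += (n_pixels / total_pixels) * (crop_dots[stratum] / n_dots)
    return estimate


# Hypothetical segment: pixel counts per classified stratum and 14 Type 2 dots.
pixels = {"corn": 6000, "soybeans": 3000, "other": 1000}
dots = (
    [("corn", "corn")] * 6 + [("corn", "soybeans")] * 2
    + [("soybeans", "corn")] * 1 + [("soybeans", "soybeans")] * 3
    + [("other", "other")] * 2
)
corn_estimate = stratified_proportion(pixels, dots, "corn")
print(round(corn_estimate, 3))  # 0.525
```

In this made-up case the classifier alone would report 60 percent corn, while the dot-corrected estimate is 52.5 percent, which is the sense in which the Type 2 dots act as a bias correction.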



The objective of LACIE was to estimate wheat production in important
wheat growing regions of the world. Recently, there has been increasing
emphasis on making production estimates for crops other than wheat; in
particular, corn and soybeans are the two crops of immediate interest.
This preliminary study has been using currently available data to
evaluate the LACIE procedures when applied to corn and soybeans and to
recommend changes in the procedures for the new classification
problem. In addressing these issues, this task supports the classification
component of the corn and soybeans research effort.

3.2 Objectives 


The overall objective of this investigation is to advance the
development of large area crop inventory systems for multicrop regions
by applying and evaluating recently developed techniques. This preliminary
study addresses parts of this objective with currently available
data. The specific objectives of the preliminary study are:

- Evaluate the LACIE Procedure 1 (P-1) for a corn, soybeans,
and "other" crop identification problem.

- Investigate parameter changes which may improve the 
performance of P-1 on corn and soybeans. 

3.3 Approach 

The preliminary study used data on corn and soybeans which were
acquired during the CITARS project. Assessment of the accuracy and
variability of P-1 estimates was done with minimal changes to the
procedure to work the three-class problem. Each CITARS segment was
divided into four 5 x 5 mile blocks for analysis because this is as
close to the LACIE segment size as possible using only the segment data.
The ratio of dots to total area to be classified was about the same
as in LACIE. A key aspect of the approach was that ground truth
or photointerpreted areas were used rather than analyst-labeled dots.
This permitted evaluation of the analysis procedure itself rather than
the image interpretation accuracy. Dot grids falling in areas with
reference data (ground truth or photointerpreted crop types) were
digitized and the pixels were associated with ground truth labels.
Both classification and area estimation accuracy were assessed and the
variability resulting from different choices of Type 1 and Type 2 dots
was estimated.

In an initial assessment of Procedure 1, analyses were conducted 
using parameters which had been used in LACIE. It was believed that 
these settings would not be optimal for the corn, soybeans, and "other"
crop identification problem due to differences in the spectral distribution 
of the crops of interest from that of wheat, more confusion crops, 
differences in crop calendar, and other factors. Therefore, a parameter 
study was initiated to begin investigation of some of these issues. 

3.4 Results of Initial Evaluation 


A major accomplishment of this task is that personnel from LARS 
have become familiar with the philosophy and methodology of Procedure 
1 and the P-1 software implemented on the Purdue/LARS IBM 370/148 
computer system. Personnel from NASA/JSC, Lockheed, and LARS have
worked cooperatively to standardize and improve the P-1 software. 

In support of this task, personnel attended the LACIE Symposium,
held October 23-26, 1978, at the Johnson Space Center, to further
their knowledge and understanding of the state of the art. A representative
from this task attended the Advanced Seminar in Multicrop
Labelling from Landsat Multitemporal Data, held November 1-8, 1978,
at the University of California, Berkeley. Analysts have been participating
in a series of workshops to learn how to apply all the clustering
and classification routines which are implemented on the LARS computer. 
These experiences, coupled with data analysis utilizing the Procedure 1 
software, have provided a well-rounded background in crop inventory 
procedures for the participants. 

Identification of General Crop Inventory Issues 

Early in the investigation, several general issues in crop inventory
were identified. The general methodology for inventory of corn and
soybeans needs to be of somewhat different design than for wheat. For
example, in corn and soybean production areas, the practice of double
cropping, particularly soybeans following winter wheat, is becoming
increasingly important. A methodology for identification and
classification of double cropped areas needs to be developed.

Cloud cover is a greater problem in the Corn Belt than in the
U.S. Great Plains; this has potential impact on the handling of
designated unidentifiable (DU) areas. If only the area which is
cloud-free on all four acquisitions is used for area estimation, 
insufficient pixels may be available to give accurate and precise 
estimates for the segment proportions. If DU areas exceed a certain 
percentage of land area in a segment, perhaps three cloud-free 
acquisitions could be used to classify some additional areas to 
provide a broader base for area estimation. 

Variability of Procedure 1 Estimates 

An analysis was run to look at the variability of the stratified 
area estimates due to the location of the dot grids. In Procedure 1, 
two types of dots are selected. Type 1 dots are used to start the clusters 
and label them and Type 2 dots are used for bias correction. Both 
types are located on a systematic sample grid. Five grids were defined, 
two Type 1 grids and three Type 2 grids, giving a total of six grid 
combinations for analysis. Using these grid combinations, six analyses were 
run keeping all other parameters and procedures constant. 

For an individual section, there was a significant amount of 
variability among the six estimates. Table B-4 gives an example of the
variability encountered for one section. There appears to be more 
variability between grids of Type 1 dots than between grids of Type 2 
dots. The interaction between grid types is also significant. This is 
best illustrated by the soybean estimates where there is a greater effect 
of Type 2 dot selection for the first selection of Type 1 dots than there 
is for the second selection. 
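
The grid effects described above can be read directly off Table B-4 with a few lines of arithmetic. The sketch below only re-expresses the published estimates; the helper names and the two summary measures (difference of Type 1 means, range across Type 2 grids) are illustrative choices, not statistics used in the original analysis.

```python
# Estimates copied from Table B-4, indexed [Type 1 set][Type 2 grid].
corn = {
    "first": {"A": 33.5, "B": 30.4, "C": 27.0},
    "second": {"A": 38.9, "B": 40.0, "C": 42.4},
}
soybeans = {
    "first": {"A": 60.0, "B": 61.6, "C": 71.4},
    "second": {"A": 56.7, "B": 54.6, "C": 52.4},
}


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def type1_gap(table):
    """Second-set mean minus first-set mean, averaged over Type 2 grids."""
    return mean(table["second"].values()) - mean(table["first"].values())


def type2_range(table, type1_set):
    """Spread across Type 2 grids A, B, C for one set of Type 1 dots."""
    row = table[type1_set].values()
    return max(row) - min(row)


for name, table in (("corn", corn), ("soybeans", soybeans)):
    print(
        name,
        round(type1_gap(table), 1),
        round(type2_range(table, "first"), 1),
        round(type2_range(table, "second"), 1),
    )
```

For soybeans the range across Type 2 grids is about 11.4 points with the first set of Type 1 dots and about 4.3 points with the second, which is the interaction noted in the text; for corn the two Type 1 sets differ by roughly 10 points on average.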


Table B-4. Proportion estimates of corn and soybeans for section 61
in Livingston county.

                      Corn                     Soybeans
Type 2 Grid   First Set   Second Set    First Set   Second Set
              of Type 1   of Type 1     of Type 1   of Type 1

     A          33.5        38.9          60.0        56.7
     B          30.4        40.0          61.6        54.6
     C          27.0        42.4          71.4        52.4


Table B-5. Averages of proportion estimates of corn and soybeans for
eight sections in Livingston county.

                      Corn                     Soybeans
Type 2 Grid   First Set   Second Set    First Set   Second Set
              of Type 1   of Type 1     of Type 1   of Type 1

     A          32.7        36.3          59.8        56.9
     B          31.7        40.9          61.1        54.9
     C          30.1        46.1          67.7        48.4


The results in Table B-5 are more indicative of the amount of
variability which might be noticed in practice since, in general, the
interest in estimation is for larger areas. In this case as well, the
choice of dot grids does significantly affect the final stratified area
estimates.

This study was conducted using 30 and 40 dots for Type 1 and 
Type 2, respectively. It is possible that using more dots could somewhat 
alleviate this problem, but insufficient ground truth was available for 
pursuit of this idea. This study does indicate, however, that final 
estimates can be significantly affected by selection of dots. It is 
necessary, therefore, to insure that: (1) the Type 1 dots represent 
the spectral subclasses present in the scene and (2) the proportions 
of Type 2 dots in each category are similar to the distribution of cover 
types in the area to be classified. Further study into these effects 
needs to be conducted to determine a methodology to remove this 
variability. 

Effect of Distributions of Dots on Area Estimates 


The LACIE procedure samples a random set of dots falling on a 
systematic grid over the segment. Type 1 and Type 2 dots are selected 
on different grids of the same type. The rationale of this sampling 
scheme is that the true distribution of crops present will be sampled 
in their respective proportions. When using the CITARS data, however, 
dots were sampled only from areas with reference data. The dots were 
not distributed throughout the segment, but a higher density of dots 
was sampled in a smaller portion of the segment which had available 
ground truth information. Since the areas sampled were either sections 
or quarter sections, the distribution of cover types present would 
probably not be as diverse as if the same number of dots were spread out 
over a larger geographic area. Table B-6 illustrates this problem. 

Table B-6. Comparisons of proportions of pixels with ground truth
available to county crop proportions.

                               Proportion
County        Block       Corn    Soybeans    Other

Fayette       1            7.8      49.7       42.4
              2           25.2      44.9       29.9
              3           24.5      68.3        7.2
              County*     14.2      23.8       62.0

Livingston    1           29.5      67.3        3.2
              2           41.3      54.1        4.6
              3            9.4      12.2       78.4
              County*     38.6      37.7       23.7

*USDA/SRS County Acreage Estimates for 1972


By selecting this type of sample, it frequently occurred that one
of the categories had very few pixels for starting clusters (perhaps
only three or four). These were insufficient to completely represent all
the spectral subclasses which might be in a category. Theory indicates
that the final estimates can be highly dependent upon the distribution of
the Type 2 or bias correction dots. For these two reasons, a test was
done to determine to what extent the dot distribution affects the
final stratified area estimates.

Using the same set of parameters for clustering and classification,
the first block of Livingston county was classified twice using two
different distributions of dots. The first distribution, which will
be referred to as the random distribution, was obtained by selecting
dots as they fell on the grid. The second distribution, which will 
be referred to as the proportional distribution, was obtained by 
selecting dots from the grid with about the same proportion for 
each category as historical crop proportions for the county indicated. 
This approach counters the difficulties of having enough dots to 
represent the numerous spectral subclasses in each category and the 
bias which could be induced in the stratified area estimate. 
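
One way to picture the proportional distribution is as a quota draw from the labeled grid dots. The sketch below is an assumption about how such a selection could be coded, not the procedure actually used: the function name, the candidate pool, and the rounding of quotas are hypothetical, and the target proportions are the Livingston county figures from Table B-6, used here purely for illustration.

```python
import random


def select_proportional(dots, target_proportions, n_select, seed=0):
    """Pick about n_select dots so each category matches its target share.

    dots:               list of (dot_id, category) pairs from the grid
    target_proportions: {category: fraction of n_select wanted}
    """
    rng = random.Random(seed)
    by_category = {}
    for dot_id, category in dots:
        by_category.setdefault(category, []).append(dot_id)

    selected = []
    for category, fraction in target_proportions.items():
        quota = round(fraction * n_select)
        pool = by_category.get(category, [])
        k = min(quota, len(pool))  # cannot take more dots than exist
        selected.extend((dot_id, category) for dot_id in rng.sample(pool, k))
    return selected


# Hypothetical pool of 120 labeled grid dots, 40 per category.
candidates = [(i, cat) for i, cat in enumerate(
    ["corn"] * 40 + ["soybeans"] * 40 + ["other"] * 40)]
targets = {"corn": 0.386, "soybeans": 0.377, "other": 0.237}

chosen = select_proportional(candidates, targets, n_select=30)
counts = {cat: sum(1 for _, c in chosen if c == cat) for cat in targets}
print(counts)  # {'corn': 12, 'soybeans': 11, 'other': 7}
```

Because quotas are rounded independently, the total can differ from `n_select` by a dot or two for other targets; a production version would need a rule for distributing the remainder.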

Corn and soybean estimates made by each method were compared on
several test areas for which true proportions were known using a 
paired t-test. The results of this comparison are presented in Table 
B-7. For both corn and soybeans, the proportion estimates derived 
by the two methods differed significantly at the one percent level. 

The estimates generated using proportional dot distributions did not 
differ significantly from the ground truth proportions. The random 
distribution estimates, however, differed significantly from the 
ground truth for soybeans (at the one percent level) and for corn 
(at the 15 percent level) . 
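
For readers unfamiliar with the paired t-test used in this comparison, a minimal sketch follows. The per-area estimates below are hypothetical placeholders (the report does not list the individual test-area values), and the function name is illustrative. The statistic is the mean of the paired differences divided by its standard error, referred to a t distribution with n - 1 degrees of freedom.

```python
import math
import statistics


def paired_t(first, second):
    """Return (t statistic, degrees of freedom) for paired samples."""
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation
    return mean_diff / (sd_diff / math.sqrt(n)), n - 1


# Hypothetical corn estimates (percent) for four test areas.
random_dist = [26.0, 28.0, 25.0, 27.0]
proportional_dist = [34.0, 35.0, 33.0, 36.0]

t_stat, dof = paired_t(random_dist, proportional_dist)
print(round(t_stat, 2), dof)
```

The resulting statistic would then be compared with a tabulated critical value for the chosen significance level, such as the one percent level quoted in the text.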

Based upon theory and these empirical results, it seems that a 
methodology should be developed to insure dot distributions which 
are representative of the distributions of cover types in the area 
to be classified. One possible solution to this problem is to 
sample from the spectral space rather than from the physical space. 

Table B-7. Effect of distributions of Type 1 and Type 2 dots on
proportion estimates for Livingston county.

                           Proportion
Crop          Random          Proportional     Ground
              Distribution    Distribution     Truth

Corn            26.2            34.7            30.4
Soybeans        71.2            38.0            39.5


Effect of Number of Dots on Area Estimates


Although this particular aspect of the procedure has not been
extensively investigated in this task, it is believed that more dots
should be used than were used in LACIE Phase III. The scene in the
Corn Belt may be more complex than scenes of primarily wheat,
indicating that with a one pass cluster routine more starting dots are 
needed to create a sufficient number of clusters to represent the 
scene. An increased number of Type 2 (bias correction) dots would 
result in a further variance reduction of the area estimates. The 
transition year (TY) analysis procedures at JSC call for a minimum of 
40 Type 1 and 60 Type 2 dots rather than the 30 and 40 required in 
Phase III. 
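
A rough feel for the variance reduction from 40 to 60 Type 2 dots can be had from the simple binomial standard error of a proportion. This is only a back-of-envelope sketch under the assumption of simple random sampling; it ignores the stratification that Procedure 1 actually uses, and the proportion of 0.30 is an arbitrary illustrative value.

```python
import math


def dot_standard_error(proportion, n_dots):
    """Binomial standard error of a proportion estimated from n_dots labels."""
    return math.sqrt(proportion * (1.0 - proportion) / n_dots)


p = 0.30  # hypothetical true crop proportion
se_phase3 = dot_standard_error(p, 40)      # Phase III: 40 Type 2 dots
se_transition = dot_standard_error(p, 60)  # transition year: 60 Type 2 dots
ratio = se_transition / se_phase3

print(round(se_phase3, 3), round(se_transition, 3), round(ratio, 3))
```

Under these assumptions the standard error falls by a factor of sqrt(40/60), about 0.82, independent of the proportion chosen.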

It is possible that even more dots might result in significantly 
more accurate and precise area estimates. It appears, however, 
that a judicious choice of dots and a good selection of clustering 
and classification parameters will provide a greater improvement in 
results than merely selecting more dots.

3.5 Summary and Future Plans

Progress has been made in identifying areas of difficulty in the
classification of corn and soybeans. Dot distribution, number of dots,
and parameters used in clustering and classification of the data seem to 
be significant factors. These analyses have been based on a small 
data set; analysis of additional data may confirm or contradict results 
obtained to date, possibly altering conclusions which may be drawn from 
the analyses. In many ways, this section should be viewed as a status 
or preliminary report of our results rather than a final report. 
Continuation and completion of the analyses described here, as well as 
additional analyses, are planned in the new SR&T contract to LARS. 

This task is continuing into a second year which will address the 
objectives given in the Introduction to this report, although the 1978 
crop year data will not be available at the beginning of the contract. 
The work to be accomplished during the period before these data become 
available includes planning specific analyses to be conducted with the 
new data and a continuation of the P-1 study using CITARS data for a 



parameter investigation. Multitemporally registered Landsat data and 
digital ground truth tapes for 5 x 6 nm sample segments will be 
available for some high density segments in Indiana, Illinois, and 
Iowa. It is these data on which analyses will be conducted to investigate
training, classification, and area estimation procedures.


C2. Multisensor Multidate Spatial Feature
Matching, Correlation, Registering,
Resampling and Information Extraction*


1. INTRODUCTION 


This subtask was formulated to seek answers to the problems of data 
merging and information extraction using multiple remote sensing and 
ancillary data types and to develop techniques for merging and analysis 
of certain data types using the results of this research. The specific 
remote sensing data types considered in this contract year are synthetic 
aperture radar (SAR) and Landsat data. Methods of merging map data and 
remote sensing data are also considered. Interest is growing in the
remote sensing community in the utility of radar imagery as an addition
to Landsat data. The tasks are oriented toward determining the spatial
and spectral characteristics of SAR data and definition of merging 
system parameters. 

2. DATA SET SURVEY AND ACQUISITION 

The study was formulated on the assumption that three aircraft SAR
data sets would be obtainable by at least the end of the second quarter.
These were to be flights over the Salisbury, Maryland area, the Gulf Coastal
Zone area, and the Phoenix, Arizona area. A Salisbury SAR flight
is in house; however, it is of poor quality, and a reprocessed data set is
being prepared but has not yet been received as of November 15, 1978.
Landsat data for Salisbury is on hand. The Gulf Coast flight has not
been flown due to SAR equipment problems on the NASA aircraft and may
be flown and made available in the late fall of 1978 or spring of 1979.
The radar data for the Phoenix site is on hand; however, the time
coincident Landsat data which was ordered in March 1978 has not been
received as of November 15, 1978. Thus, none of the expected data sets
is complete as planned.


* This report is on the work under Task 2.2C Multisensor Radiometric
Correction Correlation and Applications Analysis.

In order to permit spatial distortion investigations to proceed, the
high noise level SAR data from Salisbury, MD was used, and a second eastern
Maryland shore SAR flight data set near Cambridge, Maryland was also 
used. These data sets were generated as part of NASA Contract NAS6-2816 
from the Wallops Flight Center and are being used in this study to 
enable extension of the work on geometric characteristics of SAR/Landsat 
imagery. The characteristics of the registered data sets resulting 
from this previous study are listed in Table C-1. 

The Phoenix, Arizona data set which was to be the primary one for 
the first year of the study could not be completed due to Landsat CCT 
data availability problems. Landsat scene #5 792-16152, June 19, 1977,
was ordered in March 1978. On September 1, 1978 LARS was informed
by EDC that the frame was "unavailable" even though LARS had on hand
high quality imagery for the frame. The meaning of "unavailable" was
explored and it was determined that ancillary data record problems
existed but the imagery was readable. A request for the imagery-only
portion of the tape was made by late October 1978 and LARS is awaiting
delivery of this data. Extensive ground truth was gathered in the 
Phoenix area in March 1978 and thus, it is of great interest to complete 
the data set so that analysis can be conducted. 

In order to proceed with registration studies a fall 1972 Landsat 
frame was used to register with the SAR data. The analysis reported 
here was based on these time separated data sets.

Table C-1. Merged SAR/Landsat Data Set Description.

                        Data Set 1            Data Set 2            Data Set 3

Site Identifier         Salisbury, Maryland   Cambridge, Maryland   Phoenix, Arizona
Date of SAR Flight      August 22, 1976       August 22, 1976       June 17, 1977
Landsat Frame           2579-14535            2579-14535            1085-17330*
Landsat Date            August 23, 1976       August 23, 1976       October 16, 1972
LARS Data Set No.       76016404              76016413              72069110*
Number of Lines         2700                  681                   512
No. of Samples/Line     1906                  598                   512
Pixel Size              25.4 x 25.4 M         25.4 x 25.4 M         25 x 25 M
No. of Channels         5                     7                     7
Tape No.                3620                  3692                  160*
File No.                1                     1                     1*

* will change when 1977 Landsat data is received


3. AIRCRAFT/SAR SPATIAL/SPECTRAL MODELING

The spatial distortion characteristics of the three SAR data sets
were investigated with respect to Landsat as a reference. Three distortion
model sources were utilized. One consists of a systematic error
analysis program developed by Goodyear Aerospace for NASA. The second
consists of affine and biquadratic models generated by LARS as part of
the image registration system. The third source is the SPSS statistical
analysis package included in the LARS system programs, which utilizes first
through fifth degree polynomial representations of distortion. The
systematic model was shown to be equivalent to the affine model (see
Appendix I) and is thus not exercised in the cases discussed.

The geometric distortion of the SAR imagery relative to Landsat was 
analysed by visually selecting control points from imagery of both data 
types and processing the points with the various distortion analysis 
programs. The results of these analyses are described by site. 
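
The affine case of this control-point analysis can be sketched as an ordinary least-squares fit with three terms per coordinate (a constant, a line term, and a column term), which matches the three-term affine entries in the tables that follow. The code below is an illustrative reconstruction, not the LARS or SPSS implementation: the function names and control points are hypothetical, and the R.M.S. residual is computed with a plain divisor of n, which is an assumption about the convention used in the report.

```python
import math


def solve3(a, b):
    """Solve a 3x3 linear system by Gaussian elimination with pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for i in range(3):
        pivot = max(range(i, 3), key=lambda r: abs(m[r][i]))
        m[i], m[pivot] = m[pivot], m[i]
        for r in range(i + 1, 3):
            factor = m[r][i] / m[i][i]
            for c in range(i, 4):
                m[r][c] -= factor * m[i][c]
    x = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):
        tail = sum(m[i][c] * x[c] for c in range(i + 1, 3))
        x[i] = (m[i][3] - tail) / m[i][i]
    return x


def fit_affine(ref_points, target_values):
    """Least-squares fit of target = a0 + a1*line + a2*column."""
    rows = [(1.0, line, col) for line, col in ref_points]
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    atb = [sum(r[i] * t for r, t in zip(rows, target_values)) for i in range(3)]
    return solve3(ata, atb)


def rms_residual(ref_points, target_values, coeffs):
    """Root-mean-square misfit of the fitted model at the control points."""
    squares = [
        (coeffs[0] + coeffs[1] * line + coeffs[2] * col - t) ** 2
        for (line, col), t in zip(ref_points, target_values)
    ]
    return math.sqrt(sum(squares) / len(squares))


# Hypothetical control points: Landsat (line, column) and matching SAR line.
landsat = [(0.0, 0.0), (0.0, 100.0), (100.0, 0.0), (100.0, 100.0), (50.0, 20.0)]
sar_line = [4.0 + 1.01 * line - 0.3 * col for line, col in landsat]

coeffs = fit_affine(landsat, sar_line)
residual = rms_residual(landsat, sar_line, coeffs)
print([round(c, 4) for c in coeffs], round(residual, 6))
```

The same fit would be repeated for the SAR column coordinate; higher-degree models add product and power terms of line and column to the design rows. Solving the normal equations directly, as here, is simple but loses precision for high-degree polynomials.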


3.1 Salisbury Data 

The results of the distortion model analysis for the Salisbury,
Maryland, data set are shown in Table C-2. Since the SAR image is
very noisy, the residuals for the model are large. The errors are
approximately the same in both reference frames because there is only a
slight scale difference between the original images. The regression
modeling improves with increasing degree in general. There are some
anomalies in this trend shown between the biquartic and biquintic
models. This effect is probably due to the addition of non-significant
terms to the regression while decreasing the degrees of freedom
in the data. Using the affine model, a parametric description of the
SAR imagery relative to the Landsat image is obtained. The parameters
are shown in Table C-3.


Table C-2. Evaluation of Salisbury Overlay Models

                                    Residuals in Reference Frame
                                        SAR               Landsat
Distortion     # of     # of       Line     Column     Line     Column
Model          Terms    Points     R.M.S.   R.M.S.     R.M.S.   R.M.S.

Affine           3        34       11.15     3.54      10.87     3.74
Biquadratic      6        34       11.41     3.52      11.18     3.58
Bicubic         10        34        6.40     3.40       6.52     2.87
Biquartic       15        34        5.65     1.75       5.54     2.15
Biquintic       21        34        4.73     1.93       4.41     2.22

Landsat Grid Size       25M X 25M
SAR Grid Size           25M X 25M
Registered Grid Size    25M X 25M


Table C-3. Parameters for Salisbury SAR-Landsat Distortion Model

Line Translation       =      4.13
Column Translation     =  -1031.51
Line Scale Factor      =      1.01
Column Scale Factor    =      1.05
Angle of Rotation      =    -16.16 Degrees
Shear                  =     -0.04
  or Shear Angle       =     -2.04 Degrees


3.2 Cambridge Data Set

Using the affine model, parameters for the distortion of the SAR
image relative to the Landsat image were computed and are shown in
Table C-4.

Table C-4. Parameters for Cambridge SAR-Landsat Distortion Model

Line Translation       =    379.97
Column Translation     =    209.88
Line Scale Factor      =      0.35
Column Scale Factor    =      0.40
Angle of Rotation      =     12.91 Degrees
Shear                  =     -0.01
  or Shear Angle       =     -0.73 Degrees


The results for the regression modeling of the Cambridge distortion
are given in Table C-5. Because of the large scale difference between
reference frames, the residual errors differ greatly between reference
frames. These differences can be accounted for by scaling the residuals.
The circular error in the Landsat reference is approximately equal to
the scaled circular error in the SAR reference, i.e., with S denoting the
scale factor that converts SAR residuals to Landsat pixel units and
\sigma the R.M.S. residual,

$$(S_L \sigma_{L,\mathrm{SAR}})^2 + (S_C \sigma_{C,\mathrm{SAR}})^2 \approx \sigma_{L,\mathrm{LANDSAT}}^2 + \sigma_{C,\mathrm{LANDSAT}}^2$$
where L and C refer to line and column respectively. 
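
The scaling relation can be checked against the report's own numbers. The sketch below uses the Cambridge scale factors from Table C-4 and the 51-point affine residuals from Table C-5; the function and variable names are illustrative, and taking the circular error as the root sum of squares of the line and column residuals is an assumption consistent with the relation as reconstructed here.

```python
import math


def circular_error(line_rms, column_rms):
    """Root sum of squares of the line and column R.M.S. residuals."""
    return math.hypot(line_rms, column_rms)


# Table C-4 scale factors and Table C-5 affine residuals (51 points).
line_scale, column_scale = 0.35, 0.40
sar_line_rms, sar_column_rms = 11.34, 5.77
landsat_line_rms, landsat_column_rms = 3.85, 2.39

scaled_sar = circular_error(line_scale * sar_line_rms,
                            column_scale * sar_column_rms)
landsat = circular_error(landsat_line_rms, landsat_column_rms)

print(round(scaled_sar, 2), round(landsat, 2))  # 4.59 4.53
```

The two values agree to within about 0.06 of a Landsat pixel, which supports reading the relation as an approximate equality.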

Again the higher degree polynomial regressions model the misregistra- 
tion more closely. To obtain the 47 point data set used, the points of 
the 51 point data set whose residuals were greater than twice the standard 
deviation of the residuals were regarded as bad data points and deleted. 
The residuals subsequently obtained are reduced significantly. 
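
The editing rule that reduced the 51-point set to 47 points can be sketched as a simple threshold on the residuals. The residual values below are hypothetical, and two details are assumptions because the report does not state them: the comparison is made on the absolute residual, and the sample standard deviation is used.

```python
import statistics


def reject_outliers(residuals, k=2.0):
    """Return indices of points whose |residual| is at most k standard deviations."""
    threshold = k * statistics.stdev(residuals)
    return [i for i, r in enumerate(residuals) if abs(r) <= threshold]


# Hypothetical line residuals (pixels) for eight control points.
residuals = [1.0, -0.5, 0.8, -1.2, 0.3, 9.0, -0.7, 0.4]
kept = reject_outliers(residuals)
print(kept)  # [0, 1, 2, 3, 4, 6, 7]
```

After rejection the model would be refitted to the retained points, which is why the residuals in the 47-point rows are smaller.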

Table C-5. Evaluation of Cambridge Overlay Models

                                    Residuals in Reference Frame
                                        SAR               Landsat
Distortion     # of     # of       Line     Column     Line     Column
Model          Terms    Points     R.M.S.   R.M.S.     R.M.S.   R.M.S.

Affine           3        51       11.34     5.77       3.85     2.39
                 3        47        7.04     4.36       2.41     1.81
Biquadratic      6        51       10.95     5.47       3.73     2.32
                 6        47        6.51     4.04       2.24     1.67
Bicubic         10        51       11.13     5.23       3.78     2.24
                10        47        6.16     3.79       2.09     1.63
Biquartic       15        51       11.51     5.41       3.89     2.32
                15        47        6.16     3.79       2.09     1.63
Biquintic       21        51       11.62     5.45       3.88     2.45
                21        47        6.37     3.74       2.17     1.62

Landsat Grid Size       25M X 25M
SAR Grid Size           8.7M X 10.0M
Registered Grid Size    25M X 25M


3.3 Phoenix Data Set


The data set of primary interest in the study is Phoenix since data
quality is high and extensive ground truth is available. Figure C-1
shows the entire SAR data set for the Phoenix area. Two agricultural areas
exist at each end of the flight. The scene was scanned and digitized on
an Optronics microdensitometer by NASA Wallops and reformatted at LARS
into a LARSYS data set. Only an annotated film product was available at
the time of processing; thus, the annotations appear in the data set. This
should not cause degradation of data quality in areas not covered by the
annotation. Figure C-2 contains Landsat frame 1085-17330 which was used as a
reference in this study.

Figure C-1. Goodyear SAR Data Set over Phoenix, Arizona used in the
study. Flown on June 17, 1977 using an AN/APD-10 X band
radar in an Air Force RF-4 aircraft. Area covered is
approximately 12 by 38 miles at a resolution of approximately
10 feet.

Checkpoints were manually determined in both data sets. The Pearson
product-moment correlation was used to obtain a measure of the dispersion
of the checkpoints over the scene. If the correlation is small, then
the dispersion is good. The Pearson product-moment correlation for the
chosen data points was -0.0957. The results of the regression distortion
analysis are shown in Table C-6. The scale difference between the original
Landsat and SAR imagery is much greater in the Phoenix data set than in
the two previous ones. Using the affine distortion model, Table C-7 was constructed
which specifies the distortion in the SAR imagery relative to the Landsat 
image. 
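
The dispersion check on the checkpoints can be sketched as below. It is assumed here that the correlation is taken between the line and column coordinates of the checkpoints, which the report does not state explicitly; the function name and the coordinates are hypothetical. Points strung along a diagonal give a correlation near 1 (poor dispersion), while points spread over the block give a value near 0.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical checkpoint coordinates (line, column).
diagonal_lines, diagonal_columns = [0, 1, 2, 3], [0, 1, 2, 3]
spread_lines, spread_columns = [0, 0, 100, 100, 50], [0, 100, 0, 100, 50]

r_diagonal = pearson_r(diagonal_lines, diagonal_columns)
r_spread = pearson_r(spread_lines, spread_columns)
print(round(r_diagonal, 3), round(r_spread, 3))
```

A value such as the -0.0957 reported for Phoenix would, on this reading, indicate checkpoints that are not aligned along any one direction.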


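Table C-7. Parameters for Phoenix SAR-Landsat Distortion Model

Line Translation       =  -5219.94
Column Translation     =   2658.08
Line Scale Factor      =      5.16
Column Scale Factor    =      4.28
Angle of Rotation      =     61.47 Degrees
Shear                  =      0.03
  or Shear Angle       =      2.01 Degrees


The circular error in the SAR reference frame is again related to the
circular error in the Landsat reference frame by the relation

$$(S_L \sigma_{L,\mathrm{SAR}})^2 + (S_C \sigma_{C,\mathrm{SAR}})^2 \approx \sigma_{L,\mathrm{LANDSAT}}^2 + \sigma_{C,\mathrm{LANDSAT}}^2$$

where S is the scale factor that converts SAR residuals to Landsat pixel
units, \sigma is the R.M.S. residual, and L and C refer to line and
column respectively.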


Figure C-2. Landsat frame 1085-17330 used as reference in the study.
Imaged on October 16, 1972.


Also in Table C-6 the differences in results obtained using different
algorithms and computers to implement the regression are illustrated.


Table C-6. Evaluation of Phoenix Overlay Models

                                    Residuals in Reference Frame
                                        SAR               Landsat
Distortion     # of     # of       Line     Column     Line     Column
Model          Terms    Points     R.M.S.   R.M.S.     R.M.S.   R.M.S.

Affine
  SPSS/CDC       3        17        3.89     3.91       0.91     0.66
  SPSS/IBM       3        17        3.89     3.91       0.91     0.66
  LARS/IBM       3        17        3.53     3.58       0.90     0.81
Biquadratic
  SPSS/CDC       6        17        3.02     3.37       0.67     0.67
  SPSS/IBM       6        17        3.02     3.37       0.67     0.67
  LARS/IBM       6        17        2.72     2.92       0.54     0.66
Bicubic
  SPSS/CDC      10        17        2.53     1.54       0.21     0.65
  SPSS/IBM      10        17        2.53     1.54       0.21     0.65
Biquintic
  SPSS/CDC      10        17         --       --        0.28     0.62
  SPSS/IBM      10        17        3.15     0.07       0.28     0.62

Landsat Grid Size       76.2M X 61.10M
SAR Grid Size           14.8M X 14.2M
Registered Grid Size    25M X 25M


The residuals shown for the SPSS packages (Statistical Package for the
Social Sciences) are larger than those for the LARS affine and biquadratic
programs. This is probably due to the loss of precision in computing the
inverse matrix in the LARS program. The differences in the residuals
calculated between the SPSS program implementations are due to the
precision of the machine used. The IBM/370 version uses a 32 bit word
and the CDC 6500 a 60 bit word. These differences become evident first
in the higher degree regressions.

The results indicate that, for the small areas considered, the
linear models do as well as higher degree models for representing
distortion in the SAR imagery. The Salisbury data set was observed to 
have oscillatory scale errors and is probably not representative of 
typical flight data. The R.M.S. errors for the other two sites did 
not significantly decrease for the higher order cases. 

The biquadratic error results are in the half reference pixel range, 
thus, the current LARS registration system can implement the SAR distortion 
representation. Thus, the SAR and Landsat data were registered using 
the biquadratic results. A block of data covering the agricultural area 
between Sun City and Phoenix was registered producing a 512 x 512 block 
of data. The Landsat data was interpolated using cubic convolution to 
25 meter pixels. The results are shown in Figures C-3 and C-4. Figure 
C-3 is the interpolated Landsat data for band 5 and Figure C-4 is the 
SAR for the same area. 
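
The cubic convolution resampling mentioned above can be illustrated in one dimension. The sketch uses the standard piecewise-cubic convolution kernel with a free parameter `a`; the value a = -0.5 is an assumption, since the report does not say which parameter the LARS system used, and the function names, edge handling, and sample values are hypothetical. In two dimensions the same kernel would be applied along lines and then along columns.

```python
import math


def cubic_kernel(x, a=-0.5):
    """Piecewise-cubic convolution kernel with support on (-2, 2)."""
    x = abs(x)
    if x <= 1.0:
        return (a + 2.0) * x**3 - (a + 3.0) * x**2 + 1.0
    if x < 2.0:
        return a * x**3 - 5.0 * a * x**2 + 8.0 * a * x - 4.0 * a
    return 0.0


def resample_1d(samples, position, a=-0.5):
    """Interpolate `samples` at a fractional index using four neighbours."""
    base = math.floor(position)
    total = 0.0
    for k in range(base - 1, base + 3):
        index = min(max(k, 0), len(samples) - 1)  # replicate edge samples
        total += samples[index] * cubic_kernel(position - k, a)
    return total


# Hypothetical scan line sampled every 76.2 m, resampled every 25 m.
scan_line = [0.0, 10.0, 20.0, 30.0, 40.0]
step = 25.0 / 76.2
resampled = [resample_1d(scan_line, i * step) for i in range(10)]

midpoint = resample_1d(scan_line, 1.5)
print(round(midpoint, 3))  # 15.0
```

With a = -0.5 the kernel reproduces a linear ramp exactly away from the edges, which is a convenient sanity check; other choices of `a` sharpen or smooth the result slightly.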

4. SATELLITE SAR SPATIAL/SPECTRAL MODELING

Data availability problems with aircraft cases resulted in the
decision to include the satellite SAR case in future studies. Resources
were placed on the aircraft SAR problem and other system aspects.
Information on SEASAT SAR and other satellite SAR sensors was acquired
and reviewed during the year, but no satellite data were obtained. It is
highly likely that the technology developed for the aircraft SAR cases
will be useful in dealing with satellite SAR data; thus, the essence of
this task is considered to be fulfilled by other results reported here.

Figure C-3. Landsat Image, Channel 2 (0.6-0.7 µm), Cubic Resampling to
a 25 x 25 Meter Resolution (Phoenix, AZ).

Figure C-4. Aircraft SAR (3 cm), Cubic Resampled and Registered to 25 x 25
Meter Landsat (Phoenix, AZ).

Resources which would have been expended on the satellite SAR case
were directed toward further studies of the aircraft data. Control
point location is a difficult task and visual methods are time consuming
and inaccurate. A correlation study was conducted to see if numerical
control point finding methods would work on SAR/Landsat image pairs.
Figure C-5 contains correlation matrices for seven 101 by 101
pixel blocks from the registered Phoenix data set. The correlation of
each of the four Landsat bands with the SAR is given in the fifth row
of the matrix (the row labeled spectral band 3.00-3.00 refers to the
nominal 3 cm wavelength of the SAR). The highest correlation in any of
the blocks is -.43 for SAR versus band 5 (.6-.7 µm) in block four.
There is a five year time difference in the Landsat and SAR; however,
field structures are still very much the same and this correlation figure 
is typical of what has been observed for other sites with time coincident 
data. The purpose of this test was to see if gradient enhancement would 
increase the correlation. 
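
The reasoning behind the gradient test can be sketched with two synthetic blocks. Everything in the example is hypothetical: the block values, the function names, and the choice of a central-difference gradient operator, since the report does not say which gradient operator was applied. Two images that share a field edge but have opposite contrast correlate negatively in raw form, while their gradient magnitudes correlate positively because both respond to the same edge.

```python
import math


def gradient_magnitude(image):
    """Central-difference gradient magnitude; border pixels are left at 0."""
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx = (image[r][c + 1] - image[r][c - 1]) / 2.0
            gy = (image[r + 1][c] - image[r - 1][c]) / 2.0
            out[r][c] = math.hypot(gx, gy)
    return out


def block_correlation(a, b):
    """Pearson correlation between two equally sized image blocks."""
    xs = [v for row in a for v in row]
    ys = [v for row in b for v in row]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical 6 x 6 blocks sharing one field edge with opposite contrast.
optical = [[10.0] * 3 + [50.0] * 3 for _ in range(6)]
radar = [[80.0] * 3 + [20.0] * 3 for _ in range(6)]

raw_r = block_correlation(optical, radar)
gradient_r = block_correlation(gradient_magnitude(optical),
                               gradient_magnitude(radar))
print(round(raw_r, 3), round(gradient_r, 3))
```

Real SAR data carry speckle noise, which produces strong gradients unrelated to field edges, so the idealized improvement seen in this toy case should not be expected to carry over directly.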

A magnitude-of-gradient image transformation was made on band 6 of
the Landsat and on the SAR, and the results were added as channels 6
and 7, respectively, to the registered data set. Block correlations were
performed on the two gradient channels and the results are presented in
Figure C-6. In these tests the maximum correlation observed was 0.15.
Correlations of either gradient with any of the unprocessed channels
were not significantly higher. Gray scale images of the gradients for
each data type are shown in Figures C-7 and C-8. These results are very
unfavorable and indicate that numerical control point finding may not
be possible. Observation of the gradient images, however, indicates
considerable agreement between roads and field edges and suggests that
some scheme may work for SAR correlation. Similar analysis was carried
out for the Maryland data sets with equally poor results. The
experiments will be repeated when time coincident data is obtained for
Phoenix.
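A magnitude-of-gradient channel of the kind described above might be computed as in the following sketch. The report does not state which gradient operator was used, so the central-difference operator, the zeroed border and the name `gradient_magnitude` are assumptions made only for illustration.

```python
import math


def gradient_magnitude(img):
    """Magnitude of the gradient by central differences.

    img: 2-D image as a list of rows. Border pixels are set to 0.0
    because a central difference is undefined there (an assumption).
    """
    rows, cols = len(img), len(img[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx = (img[r][c + 1] - img[r][c - 1]) / 2.0  # along the line
            gy = (img[r + 1][c] - img[r - 1][c]) / 2.0  # across lines
            out[r][c] = math.hypot(gx, gy)
    return out


# Hypothetical image containing one vertical edge (e.g. a field boundary).
edge = [[0, 0, 10, 10] for _ in range(4)]
grad = gradient_magnitude(edge)
```

The resulting channel is large only along the edge, which is why the gradient images emphasize roads and field boundaries even when the block correlations remain low.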

The primary intended purpose of the SAR registration is to enhance
crop classification performance over that obtained with Landsat alone;
without time coincident Landsat data this could not be tested. However,
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FIELD 1 

RUN NO. 72069109     NO. OF SAMPLES 10201
LINES 20-120 (BY 1)     COLUMNS 1-101 (BY 1)

CORRELATION MATRIX

SPECTRAL      0.50-    0.60-    0.70-    0.80-    3.00-
BAND          0.60     0.70     0.80     1.10     3.00

0.50-0.60     1.00
0.60-0.70     0.95     1.00
0.70-0.80     0.68     0.64     1.00
0.80-1.10     0.29     0.24     0.86     1.00
3.00-3.00    -0.20    -0.22    -0.07     0.03     1.00


FIELD 2
RUN NO. 72069109     NO. OF SAMPLES 10201
LINES 1-101 (BY 1)     COLUMNS 220-320 (BY 1)

CORRELATION MATRIX

SPECTRAL      0.50-    0.60-    0.70-    0.80-    3.00-
BAND          0.60     0.70     0.80     1.10     3.00

0.50-0.60     1.00
0.60-0.70     0.95     1.00
0.70-0.80     0.81     0.86     1.00
0.80-1.10     0.50     0.54     0.86     1.00
3.00-3.00    -0.08     0.00    -0.05    -0.10     1.00


Figure C-5a. Correlation Matrices for Sample Fields 1 and 2 (Phoenix,
AZ; Channels 1-4, Landsat; Channel 5, Aircraft SAR).



FIELD 3
LINES 70-170 (BY 1)     COLUMNS 340-440 (BY 1)

CORRELATION MATRIX

SPECTRAL      0.50-    0.60-    0.70-    0.80-    3.00-
BAND          0.60     0.70     0.80     1.10     3.00

0.50-0.60     1.00
0.60-0.70     0.98     1.00
0.70-0.80     0.59     0.52     1.00
0.80-1.10     0.04    -0.06     0.79     1.00
3.00-3.00    -0.09    -0.09     0.07     0.15     1.00


FIELD 4
RUN NO. 72069109
LINES 180-

CORRELATION MATRIX

SPECTRAL      0.50-    0.60-    0.70-    0.80-    3.00-
BAND          0.60     0.70     0.80     1.10     3.00

0.50-0.60     1.00
0.60-0.70     0.86     1.00
0.70-0.80     0.78     0.81     1.00
0.80-1.10    -0.20     0.12     0.27     1.00
3.00-3.00    -0.37    -0.43    -0.29     0.03     1.00


Figure C-5b. Correlation Matrices for Sample Fields 3 and 4 (Phoenix, AZ).

Figure C-5c. Correlation Matrices for Sample Fields 5 and 6 (Phoenix, AZ).

Figure C-5d. Correlation Matrices for Sample Fields 7 and 8 (Phoenix, AZ).

Figure C-5e. Correlation Matrix for Sample Field 9 (Phoenix, AZ).

Figure C-6a. Correlation Matrices for Gradient of Fields 1 and 2
(Phoenix, AZ; Channels 1-4, Landsat; Channel 5, SAR; Channel 6,
Landsat Channel 3 Gradient; Channel 7, SAR Gradient).

Figure C-6b. Correlation Matrices for Gradient of Fields 3 and 4
(Phoenix, AZ).

Figure C-6c. Correlation Matrices for Gradient of Fields 5 and 6
(Phoenix, AZ).

Figure C-6d. Correlation Matrices for Gradient of Fields 7 and 8
(Phoenix, AZ).

Figure C-6e. Correlation Matrix for Gradient of Field 9 (Phoenix, AZ).

Figure C-7. Magnitude of Gradient for Landsat Channel 3 (0.7-0.8 µm)
(Phoenix, AZ).

Figure C-8. Magnitude of the Gradient for Aircraft SAR (Phoenix, AZ).


since ground truth was available for Phoenix, statistical analysis
was performed to examine the separability of crops in the SAR channel
(channel 5). Figure C-9 contains correlation matrices for the classes
alfalfa, barley, cotton, onions, sugar beets, urban and wheat. Figure
C-10 contains histograms for these classes and Figure C-11 contains
their spectral plots. Only the last row (spectral band 3.00-3.00) is
significant with respect to the ground truth. The four Landsat bands are
included to provide a typical crop vegetation comparison, but the
contents of the fields on October 16, 1972 are unknown. The SAR data
shows some discrimination for cotton, barley and urban, with alfalfa,
wheat, sugar beets and onions having similar means and variances. These
judgements are based only on histogram inspection, and detailed analysis
can only be done after the time coincident Landsat data is available.
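The histogram inspection described above compares class means and variances in a single channel. One simple way to put a number on that comparison is a normalized distance between class means, sketched below. This measure is an illustrative choice and not the method used in the report, and the sample values are invented, not the Phoenix data.

```python
import math


def mean_std(values):
    """Sample mean and standard deviation (n - 1 in the denominator)."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, math.sqrt(var)


def normalized_separation(class_a, class_b):
    """|mean_a - mean_b| / (std_a + std_b) for one channel.

    Values well above 1 suggest the two histograms overlap little;
    values near 0 suggest the classes are not separable in this channel.
    """
    mean_a, std_a = mean_std(class_a)
    mean_b, std_b = mean_std(class_b)
    return abs(mean_a - mean_b) / (std_a + std_b)


# Invented single-channel samples for three classes.
cotton = [100.0, 110.0, 120.0]
wheat = [60.0, 70.0, 80.0]
alfalfa = [62.0, 70.0, 78.0]

sep_cotton_wheat = normalized_separation(cotton, wheat)
sep_wheat_alfalfa = normalized_separation(wheat, alfalfa)
```

With these invented samples the first pair is well separated and the second is not, mirroring the qualitative pattern described in the text.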

5. GENERAL MULTIDATA MERGING SYSTEM STUDY AND MULTIDATA MERGING SOFTWARE 
AND DATA SET GENERATION 


These two tasks were fulfilled within the scope of the aircraft SAR
analysis performed and were not followed as separate task timelines,
except for the case of ancillary data. The project included consideration
of ancillary data merging problems, but this was not studied until the
fourth quarter.

The process of manually digitizing a complex polygon map is slow and
error prone, and requires costly and often unreliable gridding of
digitized arcs. An alternate method of map digitizing was described in
the June 1976 LARS SR&T Final Report: color scanning and digitizing of
colored polygons on maps, with computer classification to extract the
polygons. This method showed promise and it was decided to test the
method under controlled color conditions. In the previous test a pastel
colored printed map was used which had color dot printing patterns rather
than solid colors, resulting in noisy color signals.

The experiment carried out in the fourth quarter took as an example
a complex forest operating area map which cannot be successfully
digitized by the manual method due to the complex shapes and small size of

Figure C-9b.

Figure C-9c. Correlation Matrices for Ground Truth Classes-Sugar Beets
and Urban (Phoenix, AZ).

Figure C-9d. Correlation Matrices for Ground Truth Class-Wheat
(Phoenix, AZ).

Figure C-10b. Histograms for Class-Barley (Phoenix, AZ).

Figure C-10c. Histograms for Class-Cotton (Phoenix, AZ).

Figure C-10d. Histograms for Class-Onions (Phoenix, AZ).

Figure C-10e. Histograms for Class-Sugar Beets (Phoenix, AZ).

Figure C-10f. Histograms for Class-Urban (Phoenix, AZ).

Figure C-10g. Histograms for Class-Wheat (Phoenix, AZ).



[Line-printer plots: CLASS STATISTICS FOR GROUND TRUTH. Spectral plots
(mean plus and minus one standard deviation) for classes Alfalfa, Barley
(372 samples), Cotton (240 samples) and Onions (396 samples). Vertical
axis: spectral bands 0.50-0.60, 0.60-0.70, 0.70-0.80, 0.80-1.10 and
3.00-3.00 micrometers. Horizontal axis: 0.0 to 288.0 in steps of 36.0.]

Figure C-11a. Spectral Plots for Classes (Phoenix, AZ).

Figure C-11b. Spectral Plots for Classes (Phoenix, AZ).

Figure C-11c. Spectral Plot for Classes (Phoenix, AZ).
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areas relative to the grid size. One map segment was hand colored using 
acrylic polymer emulsion artists' paints, which give bright solid colors.
In the map segment chosen there were nineteen different areas, requiring
that many separable colors. Figure C-12 contains a black and white
reproduction of the colored map. Separability of the darker colors is
expected to be difficult. Three other sites are being colored and a
brighter water-based paint is being tested.

The colored maps will be digitized on a microdensitometer and a three
band (blue, green and red separations) LARS MIST tape will be generated.
LARSYS classification analysis will then be performed to attempt to
extract the 19 polygon types from the data. If the classification is
highly accurate then a promising alternative is available for digitizing
complex maps. In this case it may be the only way the map can be
digitized and gridded.
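The idea of extracting colored polygons by classifying scanned pixels can be sketched as follows. The LARSYS classifier itself is not reproduced here; a minimum-distance-to-means rule is substituted for brevity, and the class labels, pixel values and function names are hypothetical.

```python
def class_means(training):
    """Mean (blue, green, red) vector for each map color class.

    training: dict mapping a class label to a list of (b, g, r) samples
    taken from areas of known color.
    """
    means = {}
    for label, samples in training.items():
        n = len(samples)
        means[label] = tuple(sum(s[k] for s in samples) / n for k in range(3))
    return means


def classify(pixel, means):
    """Assign a (b, g, r) pixel to the class with the nearest mean."""
    def squared_distance(label):
        return sum((p - m) ** 2 for p, m in zip(pixel, means[label]))
    return min(means, key=squared_distance)


# Hypothetical training samples for two of the painted areas.
training = {
    "area_1": [(200, 30, 30), (210, 40, 20)],
    "area_2": [(20, 180, 40), (30, 200, 50)],
}
means = class_means(training)
label_a = classify((190, 50, 35), means)
label_b = classify((40, 170, 60), means)
```

With nineteen colors the class means crowd closer together, which is consistent with the expectation that the darker colors will be hard to separate.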

6. SUMMARY AND CONCLUSIONS 


Analysis of the geometric characteristics of the aircraft SAR
relative to Landsat indicated that relatively low order polynomials
would model the distortions to sub-pixel accuracy and bring the SAR into
registration for good quality imagery, e.g., the Phoenix Goodyear data.
The area analysed was small, about 10 miles square, so this is an
additional constraint. For the Air Force/ERIM data from Maryland none
of the tested methods could achieve sub-pixel accuracy. The reason for
this is unknown; however, the noisy (high scintillation) nature of
the data and the attendant unrecognizability of features contribute to
this error. Thus, the conclusion is that the quadratic model would
adequately provide distortion modeling for small areas, i.e., 10 to 20
miles square. Note that in the Cambridge case going from quadratic to
5th order lowered the 47 point line error only from 2.24 to 2.17 pixels.
Requirements for larger areas, e.g., a SEASAT frame, were not determined.

The spectral nature of the SAR was investigated with respect to
crop fields in the Phoenix area and some separability was noted in
histograms. Further analysis must await receipt of time coincident
Landsat data.




Figure C-12. Forest operating area map segment hand colored for scanning
and digitizing. There are 19 different areas color coded
with acrylic polymer paint.


A color map digitizing and classification scheme was studied for 
converting ancillary map data to gridded digital form. Map coloring 
and digitizing was completed by November 30, 1978 and analysis will be 
carried out in the first quarter of the follow-on year. 
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Appendix C-1 

Comparison of LARS Affine and Wallops Systematic Error Model 

The systematic error model and LARS Affine programs model geometric
distortion in an image with respect to a reference image. The programs
model rotation angle, range scale, track scale, and shear angle distortions.
An outline of the systematic error model program operation is given
in the NASA/WALLOPS SYNTHETIC APERTURE RADAR IMAGE PROCESSING SYSTEM
PLAN, so it will not be repeated here. Flowcharts of the program
operation and of the program mathematics are provided in Figures C-13 and
C-14, respectively.


The systematic error model program can be shown to be essentially
the same six parameter affine model used in the LARS AFFINE program.
The demonstration follows.


Let  P  = map track control point coordinates
     Q  = map range control point coordinates
     X  = distorted track control point coordinates
     Y  = distorted range control point coordinates
     P' = rotated track control point coordinates
     Q' = rotated range control point coordinates


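The mathematical description of the program provides the model:

\[ P_i' = A_0 \left( X_i + K(0.00005)\, Y_i \right) + \frac{1}{N} \sum_{i=1}^{N} \left( B_0 + r_{0i} \right) \]

\[ Q_i' = A_2 Y_i + \frac{1}{N} \sum_{i=1}^{N} \left( B_3 + r_{3i} \right) , \]

where $\sum_{i=1}^{N} r_{0i}^2$ = minimum and where $K \in I$ is chosen
such that this sum is a minimum; also $\sum_{i=1}^{N} r_{3i}^2$ = minimum.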
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Figure C-13. Flowchart for NASA/Wallops Systematic Error Program 
(i.e., Subroutine SKEWDT).

Figure C-14. Mathematical Model for Systematic Error Program.

This is a least-squares approximation. First, in a least-squares fit
the residuals sum to zero,

\[ \sum_{i=1}^{N} r_i = 0 , \]

so the averaged quantity in each equation equals the fitted quantity
itself. The calculation of the averages in the program is therefore
redundant in those cases where the scale and not the average scale is
used in modeling. With this simplification and allowing
AL = K(0.00005), the model becomes

\[ P_i' \approx A_0 \left( X_i + AL \cdot Y_i \right) + B_0 \]

\[ Q_i' \approx A_2 Y_i + B_3 \]

where the approximation is in the least squares sense. Introducing now
the model of the rotation,

\[ P_i' = P_i \cos(ARAD) - Q_i \sin(ARAD) = A_0 \left( X_i + AL \cdot Y_i \right) + B_0 \]

\[ Q_i' = P_i \sin(ARAD) + Q_i \cos(ARAD) = A_2 Y_i + B_3 \]

where ARAD = angle of rotation         A_0 = track scale factor
      AL   = shear                     A_2 = range scale factor
      B_0  = translation in track      B_3 = translation in range

Since ARAD is obtained by a least-squares approximation, the coordinates
rotated and least-squares again applied, the model is overall a
least-squares approximation.

Solving the above equations for $P_i$ and $Q_i$,

\[ P_i = A_0 \cos(ARAD)\, X_i + \left( A_2 \sin(ARAD) + A_0 \cdot AL \cos(ARAD) \right) Y_i + B_0 \cos(ARAD) + B_3 \sin(ARAD) \]

\[ Q_i = \left( -A_0 \sin(ARAD) \right) X_i + \left( -A_0 \cdot AL \sin(ARAD) + A_2 \cos(ARAD) \right) Y_i - B_0 \sin(ARAD) + B_3 \cos(ARAD) \]

or more simply

\[ P_i = AF \cdot X_i + BF \cdot Y_i + CF \]

\[ Q_i = DF \cdot X_i + EF \cdot Y_i + FF , \]

which is a six parameter affine transformation.
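The reduction to six affine coefficients can be checked numerically. The sketch below assumes the model in the form P' = A0(X + AL Y) + B0, Q' = A2 Y + B3, with P' = P cos(ARAD) - Q sin(ARAD) and Q' = P sin(ARAD) + Q cos(ARAD). The function names and the parameter values are hypothetical and are not taken from the Wallops program.

```python
import math


def systematic_to_affine(arad_deg, a0, a2, al, b0, b3):
    """Collapse rotation, scale, shear and translation into
    (AF, BF, CF, DF, EF, FF) with P = AF*X + BF*Y + CF and
    Q = DF*X + EF*Y + FF."""
    c = math.cos(math.radians(arad_deg))
    s = math.sin(math.radians(arad_deg))
    af = a0 * c
    bf = a2 * s + a0 * al * c
    cf = b0 * c + b3 * s
    df = -a0 * s
    ef = -a0 * al * s + a2 * c
    ff = -b0 * s + b3 * c
    return af, bf, cf, df, ef, ff


def apply_systematic(x, y, arad_deg, a0, a2, al, b0, b3):
    """Two-step form: scale, shear and translate, then undo the rotation."""
    p_rot = a0 * (x + al * y) + b0   # P'
    q_rot = a2 * y + b3              # Q'
    c = math.cos(math.radians(arad_deg))
    s = math.sin(math.radians(arad_deg))
    # Invert P' = P cos - Q sin, Q' = P sin + Q cos.
    return p_rot * c + q_rot * s, -p_rot * s + q_rot * c


# Hypothetical parameter values and test point.
params = (30.0, 2.0, 1.5, 0.1, 10.0, -5.0)
af, bf, cf, df, ef, ff = systematic_to_affine(*params)
x, y = 3.0, 4.0
p_affine = af * x + bf * y + cf
q_affine = df * x + ef * y + ff
p_direct, q_direct = apply_systematic(x, y, *params)
```

Both routes give the same point, which is the sense in which the two-stage systematic model is a six parameter affine transformation.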

The LARS AFFINE program performs a six parameter least-squares
fit for the delta functions

\[ \Delta_x(X_i^A, Y_i^A) = X_i^B - X_i^A = a_0 + a_1 X_i^A + a_2 Y_i^A \]

\[ \Delta_y(X_i^A, Y_i^A) = Y_i^B - Y_i^A = b_0 + b_1 X_i^A + b_2 Y_i^A , \]

where superscripts A and B denote the RUNA image and RUNB image,
respectively. When the transformation is implemented, the delta functions
are computed for each point in the area of the RUNA image to be
registered. This transforms the RUNA image coordinates (LANDSAT) to the
RUNB image coordinates (SAR), which determines the pixel (or interpolated
pixel set) in the RUNB image to overlay at the corresponding RUNA
coordinate position. This is the inverse operation of the systematic
error model, if the P, Q (map coordinates) are regarded as the LANDSAT
and the distorted image (X, Y coordinates) as the SAR. Therefore, when
residual errors are quoted in the Affine program, the errors are with
respect to the RUNB or SAR image. When residual errors are quoted in the
systematic error model program, the errors are with respect to the P, Q
or LANDSAT image. The resolution in the SAR image is usually much finer
than that of the LANDSAT, so an error of 1 pixel in the LANDSAT image as
quoted by the systematic error model program might map into an error of
3 pixels in the SAR image as stated by the Affine program. This is
due to the scaling differences between the images. The circular errors
in each reference frame are related by

\[ (S_x \sigma_{xA})^2 + (S_y \sigma_{yA})^2 = \sigma_{xB}^2 + \sigma_{yB}^2 . \]
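A six parameter least-squares fit of the delta functions can be sketched with the normal equations, which is consistent with the AT*A matrix, determinant and inverse that appear in the AFFINE printout. The code is an illustration under that assumption and not the LARS program; the function names and the checkpoint coordinates are hypothetical.

```python
def solve3(matrix, rhs):
    """Solve a 3x3 linear system by Gaussian elimination with pivoting."""
    a = [row[:] + [value] for row, value in zip(matrix, rhs)]
    n = 3
    for i in range(n):
        pivot = max(range(i, n), key=lambda r: abs(a[r][i]))
        a[i], a[pivot] = a[pivot], a[i]
        for r in range(i + 1, n):
            factor = a[r][i] / a[i][i]
            for c in range(i, n + 1):
                a[r][c] -= factor * a[i][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(a[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (a[i][n] - tail) / a[i][i]
    return x


def fit_delta(points_a, points_b):
    """Least-squares fit of delta_x = a0 + a1*XA + a2*YA and
    delta_y = b0 + b1*XA + b2*YA from checkpoint pairs.

    points_a, points_b: lists of (x, y) in the RUNA and RUNB images.
    Returns ((a0, a1, a2), (b0, b1, b2)).
    """
    ata = [[0.0] * 3 for _ in range(3)]
    at_dx = [0.0] * 3
    at_dy = [0.0] * 3
    for (xa, ya), (xb, yb) in zip(points_a, points_b):
        row = (1.0, xa, ya)
        for i in range(3):
            for j in range(3):
                ata[i][j] += row[i] * row[j]
            at_dx[i] += row[i] * (xb - xa)
            at_dy[i] += row[i] * (yb - ya)
    return tuple(solve3(ata, at_dx)), tuple(solve3(ata, at_dy))


# Hypothetical checkpoints generated from a known affine mapping.
points_a = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0), (5.0, 3.0)]
points_b = [(5.0 + 0.5 * x + 0.2 * y, -3.0 + 0.1 * x + 1.2 * y)
            for x, y in points_a]
a_coef, b_coef = fit_delta(points_a, points_b)
```

Because the example checkpoints are noise free, the fit recovers the generating coefficients; with real checkpoints the residuals give the r.m.s. errors quoted by the program.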

The following shows that if the checkpoint pairs are reversed in the
systematic error model program, then the LARS Affine and the systematic
error model program are identical.

The systematic program model has been shown to be

\[ P_i = AF \cdot X_i + BF \cdot Y_i + CF \]

\[ Q_i = DF \cdot X_i + EF \cdot Y_i + FF \]

If the P, Q coordinate pair is allowed to represent the RUNB
coordinates and the X, Y coordinates the RUNA coordinates, then

\[ X^B = AF \cdot X^A + BF \cdot Y^A + CF \]

\[ Y^B = DF \cdot X^A + EF \cdot Y^A + FF . \]

The model used for the LARS Affine program is

\[ \Delta_x(X^A, Y^A) = X^B - X^A = a_0 + a_1 X^A + a_2 Y^A \]

\[ \Delta_y(X^A, Y^A) = Y^B - Y^A = b_0 + b_1 X^A + b_2 Y^A . \]

So,

\[ X^B = a_0 + (a_1 + 1) X^A + a_2 Y^A \]

\[ Y^B = b_0 + b_1 X^A + (b_2 + 1) Y^A . \]

Therefore, the models are equivalent where

\[ a_0 = CF, \quad a_1 = AF - 1, \quad a_2 = BF \]

\[ b_0 = FF, \quad b_1 = DF, \quad b_2 = EF - 1 . \]
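The correspondence between the affine coefficients and the LARS delta coefficients is a one-line conversion in each direction. The sketch below assumes the relations a0 = CF, a1 = AF - 1, a2 = BF, b0 = FF, b1 = DF, b2 = EF - 1; the function names and the numbers are hypothetical and are not the AFFPAR subroutine.

```python
def affine_to_delta(af, bf, cf, df, ef, ff):
    """Affine coefficients to delta coefficients (a0, a1, a2), (b0, b1, b2)."""
    return (cf, af - 1.0, bf), (ff, df, ef - 1.0)


def delta_to_affine(a, b):
    """Delta coefficients back to (AF, BF, CF, DF, EF, FF)."""
    a0, a1, a2 = a
    b0, b1, b2 = b
    return (a1 + 1.0, a2, a0, b1, b2 + 1.0, b0)


# Hypothetical affine coefficients.
affine = (0.5, 0.2, 5.0, 0.1, 1.2, -3.0)
a, b = affine_to_delta(*affine)
round_trip = delta_to_affine(a, b)
```

The only difference between the two descriptions is the identity term: the delta form stores the displacement, so 1 is subtracted from the diagonal coefficients AF and EF.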

The program for the systematic error model was edited so that the
reversal was obtained. A subroutine, AFFPAR, was appended to the
systematic program to calculate the affine and LARS "delta" Affine
parameters. Another subroutine, RESID, was also added to the systematic
program to calculate residual errors between the initial map coordinates
and rotated coordinates using the model.

An example showing the equivalence of the two programs and an
example showing corresponding changes in the r.m.s. error when the
mapping is reversed are shown in Figure C-15 and Figure C-16. Here it
should be noted that the "Variance" shown in the WALLOPS program
description outline and in the labeling of the printed results is
actually the standard deviation, not the variance.

[Figure C-15 printout (systematic error model program, checkpoint pairs
reversed). The listing shows Table 1, residual errors after rotation,
scale, translation and shear, followed by the average error magnitudes,
the RUNA line and column checkpoints, the RESID calculation of residuals,
and the affine and affine delta parameters. Legible derived parameters:

  TRACK DIRECTION SCALE FACTOR   -0.2219
  RANGE DIRECTION SCALE FACTOR   -0.2032
  SHEAR                          -9.084 DEGREES
  LINE TRANSLATION               912.289716
  COLUMN TRANSLATION             774.309689

The remaining entries are not legible.]

Figure C-15. Systematic Error Model Program Example Results with
Checkpoint Pairs Reversed.

[Figure C-16 printout (LARS AFFINE transform program, checkpoint pairs
reversed). The listing shows the normal-equation matrix AT*A, its
determinant and inverse, and, for each checkpoint, the line and column
deltas, the computed deltas and the errors. Legible fitted coefficients
and derived parameters:

                   LINE          COLUMN        CONSTANT
  LINE DELTA      -0.913370     -0.173547     913.391073
  COLUMN DELTA     0.204133     -0.888630     774.908094

  LINE SCALE FACTOR       0.221754
  COLUMN SCALE FACTOR     0.203264
  ANGLE OF ROTATION     -67.004776 DEGREES
  SHEAR                   0.170825 (SHEAR ANGLE 9.6940 DEGREES)

The remaining entries are not legible.]

Figure C-16. LARS AFFINE Model Program Example Results with Checkpoint
Pairs Reversed.

A comparison of the results of the two programs shows that they are
essentially the same model (allowing for small computational errors).
The differences noticed between the r.m.s. errors calculated by the
systematic program and by RESID are due to the fact that errors computed
in the program are with P and Q rotated with respect to X and Y, whereas
in RESID errors are computed with X and Y rotated with respect to P and
Q. Therefore, a small error is interchanged between the line and column
errors between the two calculations.


The LARS Affine program obtains the model in total with only one
least-squares fit, while the systematic program requires at least six to
obtain the same result. The additional insight the systematic error
program provides in printing rotation angle, scaling, and shear angle
can be obtained in the LARS program with the addition of a simple
subroutine calculation. The following is a derivation of the necessary
subroutine calculations.

The AFFINE delta function

Solving these for the rotation angle $\theta$, the scale factors $S_x$
and $S_y$, and the shear $\alpha$:

\[ \theta = \arctan\left( \frac{-b_1}{a_1 + 1} \right) \]

\[ S_y = (b_2 + 1) \cos\theta + a_2 \sin\theta \]

\[ S_x = \frac{a_1 + 1}{\cos\theta} \]

\[ \alpha = \frac{ a_2 \cos\theta - (b_2 + 1) \sin\theta }{ S_y } \]
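The parameter relations can be exercised on the coefficients printed in Figure C-16. The sketch below assumes the relations theta = arctan(-b1/(a1+1)), Sx = (a1+1)/cos(theta), Sy = (b2+1)cos(theta) + a2 sin(theta) and shear = [a2 cos(theta) - (b2+1) sin(theta)]/Sy. The coefficient values are as read from the damaged printout, so their last digits are an assumption, and the function name is hypothetical. `atan2` and `hypot` are used so that the line scale factor comes out positive, which matches the convention described for the LARS calculation.

```python
import math


def delta_to_parameters(a1, a2, b1, b2):
    """Rotation, scales and shear from the LARS delta coefficients.

    Returns (rotation in degrees, line scale, column scale, shear,
    shear angle in degrees).
    """
    theta = math.atan2(-b1, a1 + 1.0)
    # Equal to (a1 + 1) / cos(theta), but safe when cos(theta) is 0.
    s_x = math.hypot(a1 + 1.0, b1)
    s_y = (b2 + 1.0) * math.cos(theta) + a2 * math.sin(theta)
    shear = (a2 * math.cos(theta) - (b2 + 1.0) * math.sin(theta)) / s_y
    return (math.degrees(theta), s_x, s_y, shear,
            math.degrees(math.atan(shear)))


# Line and column delta coefficients as read from Figure C-16.
rotation, line_scale, column_scale, shear, shear_angle = delta_to_parameters(
    a1=-0.91336995, a2=-0.17354697, b1=0.20413297, b2=-0.88862997)
```

The results agree with the printed line scale factor (0.221754), column scale factor (0.203264), rotation (-67.004776 degrees) and shear (0.170825, or 9.6940 degrees) to within the precision of the printed coefficients.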



In implementing these relations the single least-squares fit operation
of the LARS Affine program will also provide a parametric description of
the distortion. Table C-8 provides a comparison of the systematic error
model and the LARS affine model. The direction of the scaling and the
angular rotations apparently differ. They are actually the same: the
LARS affine calculation of the parameters chooses the rotation and scale
directions such that the line scale factor is always positive. The small
errors between the LARS affine and systematic calculated residuals are a
result of the systematic error model, which rotates the reference and then
scales and skews, whereas the LARS model rotates the distorted image.

Table C-8. Comparison of WALLOPS Systematic Error Model and LARS Affine
Model.

                              FORWARD                          REVERSED
                      LARS          WALLOPS            LARS          WALLOPS
                      AFFINE        SYSTEMATIC         AFFINE        SYSTEMATIC

LINE R.M.S. ERROR     3.82530       4.1923 (3.829)*    0.80489       0.8930 (0.842)*
COLUMN R.M.S. ERROR   3.63939       3.1940 (3.622)*    0.77234       0.7615 (0.818)*
LINE TRANSLATION      -5238.323913  -5242.444701       913.391073    912.289716
COLUMN TRANSLATION    2646.570270   2653.129237        774.908094    774.309689
TRACK SCALE           5.157719      -5.1582            0.221754      -0.2219
RANGE SCALE           4.299145      -4.2992            0.203264      -0.2032
ROTATION ANGLE        61.387711°    -118.6180°         -67.004776°   113.2649°
SHEAR ANGLE           2.0788°       -1.822°            9.6940°       -9.084°

* LARS calculation of residual in systematic error program